Links and short descriptions of beautiful problems in computer science.
The beauty of computer science lies mainly on:
- its huge usefulness
- a few theoretical questions with deep implications
- the endless race for the ultimate algorithm that solves each problem
Things mentioned here might be repeated elsewhere in this repository: this is only a summary of the cream of the crop.
-
https://en.wikipedia.org/wiki/List_of_algorithms
Large list of important algorithms ranged by category.
Good place to look for cool algorithms.
-
Search for useful / beauty lists on Google: https://medium.com/@_marcos_otero/the-real-10-algorithms-that-dominate-our-world-e95fa9f16c04
- learn: websites, books
- have fun: games, music
Programs can help to:
- view: web browsers, PDF viewers
- edit: text editors, image editors
- transmit: web
See this.
https://en.wikipedia.org/wiki/P_versus_NP_problem
Easy to state, hard to prove.
Many other analogous questions like PSPACE vs NP.
http://en.wikipedia.org/wiki/Undecidable_problem
Mysterious feeling: there are problems that you can state, but not solve in general.
A recursive definition of computable for integer functions: https://en.wikipedia.org/wiki/Primitive_recursive_function#Relationship_to_recursive_functions Famous counter example which removes the "Minimisation operator": https://en.wikipedia.org/wiki/Ackermann_function, and falls under https://en.wikipedia.org/wiki/Primitive_recursive_function
1/0 knapsack can be solved in polynomial time in
http://en.wikipedia.org/wiki/FPTAS
We can write an emulator of a Turing machine as another Turing machine.
The simulator takes as initial tape:
- a description of a Turing machine
- the arguments for that Turing machine
and runs the described Turing machine on it's arguments, without ever overwriting the given machine description.
The question then becomes: what is the emulator with the fewest number of states possible: https://en.wikipedia.org/wiki/Universal_Turing_machine#Smallest_machines
15 states for a 2-symbol machine. TODO proved?
There are also 2-state 18 symbol universal machines with multiple symbols: http://cstheory.stackexchange.com/questions/10207/whats-the-simplest-noncontroversial-2-state-universal-turing-machine
There is a controversial claim of a (2,3) complete machine: https://en.wikipedia.org/wiki/Wolfram's_2-state_3-symbol_Turing_machine
So any further interesting result has to be for between 2 - 18 symbols and 2 - 15 states.
This is a list of cool problems and algorithms that solve those problems.
It is cool to understand how the following calculations can be implemented:
-
factorial
-
Fibonacci series.
Specially interesting to look at efficient different implementations and their efficienty:
-
simple recursive:
$O(2^n)$ time! -
dynamic programming with array:
$O(n)$ time,$O(n)$ memory. -
dynamic programming storing border only:
$O(n)$ time,$O(1)$ memory.
-
-
MCD, MCM.
-
Combinatorial optimization and satisfiability:
-
Knapsack. Tons of variants at: http://en.wikipedia.org/wiki/List_of_knapsack_problems
-
A monetary system has 3 types of coins: 3, 5 and 7. What is the largest ammount that cannot be attained?
NP-complete in the number of types of coins. For a fixed number of coins, polynomial.
For 2 coins, has an explicit formula.
-
What is the smallest number of coins needed to make a given value exactly?
NP-complete in the number of types of coins.
Like Knapsack but:
- unlimited supply
- each cost is 1
- must match exactly
-
Generalization of change making:
- limited supply
- the cost of each coin is different
-
Subset sum problem.
Is there a subset whose sum is exactly 0?
Subset sum problem.
http://en.wikipedia.org/wiki/Subset_sum_problem
There was even an early public key encryption system based on this problem, but it was broken: http://en.wikipedia.org/wiki/Merkle%E2%80%93Hellman_knapsack_cryptosystem
http://en.wikipedia.org/wiki/3SUM has a
$N^2$ algorithm conjectured to be optimal.
-
-
Sorting.
Compare all the sorting algorithms! Which is the best?
-
Heaps: the search for the ultimate heap: http://en.wikipedia.org/wiki/Heap_%28data_structure%29#Comparison_of_theoretic_bounds_for_variants
-
Graph algorithms.
-
BFS vs DFS. See queue vs stack duality of non recursive implementation.
-
Single source shortest path:
-
dijikstra 1959
$O(|E| + |V| log |V|)$ . Positive weights only. -
A* 1968. Heuristic improvement over dijikstra if a distance function exists.
-
bellman-ford 1965.
$O(|V|^2 log |V|)$ , but also accepts negative weights.
-
-
All pairs shortest path:
-
floyd warshall 1959 / 1962.
$O(|V|^3)$ , negative weights ok. -
johnson's algorithm 1977
$O( |V|^2 log |V| + |V||E|)$ , therefore better than floyd marchal for sparse graphs. Negative weights ok.
-
-
Hamiltonian Path Problem. Is there a path that visits all vertices? NP-complete.
-
Traveling Salesman Problem. Edges have weights. Find shortest path that visits all vertexes. NP-hard.
-
Route Inspection Problem. Find the shortest path that covers all edges of an undirected graph, possibly passing more than once over the same edge.
$O(n^3)$ . If an Eulerian Path exists, it is the trivial solution.If the graph is mixed, NP-complete! Beautiful, small problem difference, huge complexity difference. aka Chinese Postman Problem.
-
Vehicle Routing Problem. Generalization of Route Inspection, but edges have one or more weights, and multiple constraints are possible.
-
Several interesting concepts and conjectures.
Used for https://en.wikipedia.org/wiki/Register_allocation during compilation.
-
You can color a map with 4 colors in a way that no two countries share a border with same color.
Graph statement: every planar graph can be colored in a way that no two neighbours have the same color.
Conjectured in 1852, proven in 1976.
Proof uses a computer to check each 1936 base cases, first of its type.
Proof for the 5 color case much easier and completed in 1890.
Simple to state, hard to prove.
Complexity of finding a coloring:
$O(n^2)$ .
-
-
Self balancing search trees. RB-tree, AVL tree, B-tree.
Allow searches, insertions and other operations in
$O(log(n))$ .A sorted array only allows
$O(log(n))$ search, but not insertion.
-
-
Dynamic programming:
-
Solvable in
$O(n)$ . -
Solvable in
$O(n log n)$ .
-
-
Cool graph visualization that becomes a Sierpinski triangle, and which make crystal clear:
- uniqueness
- length of the optimal path:
$2^n - 1$ . This path is simple to compute.
Variations are also interesting:
-
Go from any initial configuration to any final configuration.
-
the graph shows that there can be only either 1 or 2 optimal paths.
-
the original problem of moving all pegs from one configuration to another is the problem that takes the longest time possible.
This time is also attained by any other problem on the opposite side of the sierpinski triangle.
Any starting position on the interior will lead to shortesr optimal paths to anywhere.
-
there is an explicit formula by Hinz and Chan Hat-Tung for the average optimal solution length if initial and states are chosen at random which is assymptotically:
$466/885 * 2^n - 1/3 + o(1)$
-
-
Any increase the number of pegs.
For 4 pegs, the problem is known as Reve's puzzle.
It is possible to do a brute-force solution by representing the problem as a graph and doing BFS, but this is of course exponential in memory and time.
There is no polynomial algorithm proven to compute the solution although the Frame-Stewart algorithm is conjectured to do so.
As a consequence, exact bounds for a given number of disks and pegs are unknown.
-
Pseudo random number generation.
-
Branch and bound with relaxation.
Relaxation means to suppose the problem can have fractional solutions.
It becomes then a continuous linear programming problem, and can therefore be solved in polynomial time.
We know that the actual solution cannot be better than the relaxed one, since the relaxed solution has many more possibilities, so we can bound what is the best possible solution.
This allows for more efficient cutting.
-
karp's 21 NP complete problems
Very famous list of NP complete problems you should know about.
-
Obvious practical applications.
Let:
-
$p$ be the pattern length -
$t$ be the text length
Naive algorithm does
$O(pt)$ worst case.Boyer–Moore does
$O(t)$ worst case. -
Primality testing: given a number, is it prime or not?
For a long time it was an open question if a polynomial algorithm existed.
Note that the naive algorithm of trying division by every number below
The question was solved positively in 2002 by the AKS primality test. Before AKS, there were many algorithms which were either probabilistic, or dependent on the truth of unproven conjectures, at the time of their proposal, such as the Miller test which depends on the generalized Riemann hypothesis.
Many variants of AKS were created, and one of them is known to run in
In practice however, faster probabilistic algorithms are used. GPG, PGP and OpenSSL use a mixture of:
- trial division by small primes
-
http://en.wikipedia.org/wiki/Fermat_primality_test,
$log^2(n)$ -
http://en.wikipedia.org/wiki/Miller%E2%80%93Rabin_primality_test,
$log^3(n)$
Those algorithms used a known property of every prime that not many other numbers have, e.g. Fermat's primality test uses Fermat's little theorem.
http://en.wikipedia.org/wiki/Carmichael_number
Always passes Fermat's primality tests for any choice of random number.
Carmichael numbers are very rare, making Fermat's primality test acceptable:
- infinity conjectured earlier, but only proved in 1994
- bounds exist for every range, including the 1994 lower bound that proved infinity
- asymptotic distribution is open but conjectured
Korselt's criterion (1899) gives an iff condition for a number
- square free
- all prime divisors
$p$ of$n$ hold$p - 1$ |$n - 1$
-
Unknown sup of lower bounds.
-
Naive:
$n^2$ . -
Karatsuba: http://en.wikipedia.org/wiki/Karatsuba_algorithm
Multiplication in
$O(n^{\log_{2}3})$ . -
Tom-cook.
Generalization of Karatsuba.
http://en.wikipedia.org/wiki/Toom%E2%80%93Cook_multiplication
Algorithm family, hard to analyse, Toom-3 does
$O(n^{\log_{3}5})$ , about$O(n^{1.465})$ -
Schonhage–Strassen
$O(log n log log n)$ FFT based.
http://en.wikipedia.org/wiki/Integer_factorization
Many operations can be done more efficiently in memory and time if we only want the mod output:
A operation B (mod C)
Such operations come up often in cryptography.
Examples:
-
Square root
May not exist.
Existence criteria given at: http://en.wikipedia.org/wiki/Quadratic_reciprocity
Efficient algorithm given existence: http://en.wikipedia.org/wiki/Tonelli%E2%80%93Shanks_algorithm
http://en.wikipedia.org/wiki/Matrix_multiplication#Algorithms_for_efficient_matrix_multiplication
Is there an
This is practically important because matrix multiplications are done millions and millions of times to approximate solutions to physical problems.
Naive school algorithm:
First improvement: 1969 Strassen. Easy to understand, and better in practice for large enough matrices.
Current best: 1987 Volker-Strassen. More complicated and not yet useful on practical matrix sizes found on current problems.
These algorithms / problems may require analysis background to understand.
They are fun and important to implement solutions using computers.
-
Exponential.
-
Integration and derivation.
-
Taylor series.
-
Solve
$f(x) = 0$ :- divide by two
- newtons method
-
Matrices.
-
FFT.
-
Differential equations: ordinary/partial.
-
Finite elements.
https://en.wikipedia.org/wiki/Hilbert%27s_tenth_problem
Given an integer Diophantine equation
Famously proposed as an important problem in 1900, last step of the undecidability proof in 1970.
Interesting subset problems include:
-
Fermat's last problems decides negatively a small subset of Diophantine equations of the form
$x^n + y^n = z^n$ . -
limiting maximum degree:
-
1: efficient solution
-
2: there is an algorithm: http://math.stackexchange.com/questions/181380/second-degree-diophantine-equations/181384#comment418090_181384, but not efficient.
-
3: open
-
4: equivalent to the general problem of degree
$n$ , so undecidable
-
If a set is defined by a system of Diophantine equations, it can also be defined by a system of Diophantine equations in only 9 variables (Matiyasevich 1999).
http://en.wikipedia.org/wiki/Diophantine_quintuple
In a game of chess, if both player play perfectly, does the first player always win? Always loses? Always draws? Open as of 2013.
Is there a polynomial algorithm that allows to chose the next perfect move? Or is brute force necessary (almost never polynomial for interesting games).
Some games have been weakly solved: it have been proven that one player always wins / loses, bit the actual strategy is unknown.
Generalization for non perfect information games: strategy that leads to greatest expected outcome.
Non-mathematical question: if it is not polynomially decidable, do the best computers today beat the best humans?
List of solved games: https://en.wikipedia.org/wiki/Solved_game
2013
Humans win.
2013
Humans tie.
Draw.
Good programming exercise!
tic tac toe is an specific instance of an m-n-k game.
Generalizes tic tac toe and simple gomoku: https://en.wikipedia.org/wiki/M,n,k-game
Second player cannot win with perfect play.
Many win / lose results exist in function of m, n and k.
First player forces win in at most 41 moves.
Solved in 1988, without computer brute force.
Solved in 2007 with extensive number crunching.
https://www.newscientist.com/article/dn12296-checkers-solved-after-years-of-number-crunching/
https://en.wikipedia.org/wiki/Chinook_(draughts_player)
-
TODO poker? other card games. Could not find.
-
Guess Who http://arxiv.org/abs/1509.03327
-
Randomness can be seen as incomplete information:
- Cards in other player's hands: incomplete information that one player does not have
- Cards left in deck: incomplete information that no player has
- Next dice rolls: incomplete information that no player has. Full description is an infinite sequence of rolls if the game does not have a turn limit.
http://en.wikipedia.org/wiki/Euclidean_shortest_path 2D: P. 3D: NP-hard.
http://en.wikipedia.org/wiki/Closest_pair_of_points_problem
http://en.wikipedia.org/wiki/Convex_hull_algorithms
Leads to huge finite existence bounds like: https://en.wikipedia.org/wiki/Graham's_number