JavaAndroidPractise/Euler/mathematica/p106.mathematica at java_-basics · Tecmax/JavaAndroidPractise

138 lines (135 loc) · 7.32 KB
 * Solution to Project Euler problem 106
 * Copyright (c) Project Nayuki. All rights reserved.
 * https://www.nayuki.io/page/project-euler-solutions
 * https://github.com/nayuki/Project-Euler-solutions
 * Lemma to confirm denominator:
 *   For each natural number n >= 2, any set of size n has exactly
 *   (3^n + 1) / 2 - 2^n unordered pairs of non-empty disjoint subsets.
 *   0. Let A be an arbitrary set of size n. We want to count its subset pairs.
 *   1. Suppose we can label each element of A with a letter from the set {B, C, N}.
 *      A label is defined as an n-tuple of letters from the set {B, C, N}.
 *      For example, the set {a,b,c,d} can be labeled as (B,B,C,N) or (N,C,N,N).
 *   2. Let L be the set of all possible labels on A. We know that |L| = 3^n.
 *   3. If a label doesn't contain any B's, then it is equivalent to saying that the
 *      label is composed of any number of C's and N's, so there are 2^n of them.
 *   4. If a label doesn't contain any C's, then it is equivalent to saying that the
 *      label is composed of any number of B's and N's, so there are 2^n of them.
 *   5. If a label doesn't contain any B's or C's, then it is the singleton of all N's.
 *   6. How many labels contain at least one B and at least one C?
 *      Using the inclusion-exclusion principle, we have:
 *      |Have B and Have C| = |All labels| - |Lacks B or Lacks C|
 *          = |All labels| - (|Lacks B| + |Lacks C| - |Lacks B and Lacks C|)
 *          = |All labels| - |Lacks B| - |Lacks C| + |Lacks B and Lacks C|
 *          = 3^n - 2^n - 2^n + 1.
 *   7. For an arbitrary label that has at least one B and at least one C,
 *      what actually matters is which elements of A are put into one subset
 *      and which elements of A are put into the other disjoint subset, but
 *      the names of these subsets as B and C are interchangeable. Hence we
 *      divide by 2 to remove this degree of freedom, giving us a final count
 *      of (3^n - 2^n - 2^n + 1) / 2 = (3^n + 1) / 2 - 2^n.
 * Corollary:
 *   This confirms some values mentioned in the problem statement:
 *   - Set size n = 7 has 966 subset pairs.
 *   - Set size n = 12 has 261625 subset pairs.
 * Main theorem:
 *   Let A be an arbitrary set such that all of the following hold:
 *   - Its size is n.
 *   - It consists only of positive integers.
 *   - It satisfies the property (ii) given in the problem statement.
 *   - a0 < a1 < ... < a_{n-1} are all the elements of A listed in ascending order.
 *   To verify property (i), we would need to test some number of unordered pairs
 *   of non-empty disjoint subsets of A to see that the subsets have unequal sums
 *   Then we claim that exactly this many pairs need to be tested for equality:
 *   The summation of (n choose 2k) * [(2k choose k) / 2 - (2k choose k) / (k + 1)]
 *   for k from 2 to floor(n / 2) (inclusive).
 *   We begin by arguing about what subset pairs don't need to be tested,
 *   then progress to counting which subset pairs do and don't need to be tested.
 *   Let B and C be an arbitrary pair of non-empty disjoint subsets of A.
 *   Furthermore, restrict the pair  (and prevent duplicate counting) so that
 *   the smallest element of B is smaller than the smallest element of C.
 *   Assume that for each element of B and C, we know what index of A
 *   it comes from, but we don't look at the actual element's value.
 *   If sets B and C have different sizes, then property (ii) implies that
 *   either S(B) < S(C) or S(B) > S(C), which in both cases imply S(B) != S(C).
 *   Hence we only care about the cases where |B| = |C|.
 *   If |B| = |C| = 1, then we know by disjointness that S(B) != S(C).
 *   (Namely because each set has a different singleton element.)
 *   For the interesting case, we have |B| = |C| >= 2. If we can match each
 *   element of B to a unique larger element in C, then we know for sure that
 *   S(B) < S(C), which further implies that S(B) != S(C). When we find such
 *   a matching, the ordering of the elements of A already implies inequality,
 *   without the need to examine the actual element values.
 *   To illustrate with a concrete example, suppose B = {a0,a1} and C = {a2,a3}
 *   are subsets of some set A. For each element of the set B or C, we are
 *   assumed to know what index of A the element came from. Because we know
 *   a0 < a2 and a1 < a3, we add these inequalities to get S(B) = a0 + a1
 *   < a2 + a3 = S(C), implying S(B) != S(C). Similarly, with B = {a0,a2} and
 *   C = {a1,a3}, we have a0 < a1 and a2 < a3, leading to a0 + a2 < a1 + a3.
 *   But in the case of B = {a0,a3} and C = {a1,a2}, we cannot conclude
 *   whether S(B) equals S(C) without examining the actual element values.
 *   Let's imagine scanning the elements of A in ascending order. When we
 *   encounter an element a_i that is:
 *   - In B, then we push it onto a stack, remembering that we need to
 *     pair it with a later (and thus larger) element that is in C.
 *   - In C, then we consult the stack. If the stack is empty, then this
 *     element cannot be paired with an earlier (and thus smaller) element that
 *     is in B, so no match exists. Otherwise we pop one element from the stack.
 *   - Not in B or C, then we ignore it because it plays no role in the sums.
 *   Suppose we lay out the elements of A as a sequence, and label each element
 *   according to the subsets B and C in a particular way. If element a_i is in B,
 *   then label it as ( (left parenthesis). If element a_i is in C, then label it
 *   as ) (right parenthesis). Otherwise label it as nothing/space.
 *   To illustrate, suppose A has size 5, B = {a0,a1}, and C = {a2,a3}.
 *   Then this pair of subsets corresponds with the label "(( ))" on A.
 *   We can see that a pair of subsets doesn't need to be tested if
 *   its label corresponds to a string of balanced parentheses. This is
 *   a well-known combinatorics problem (which I won't try to prove):
 *   The Catalan number C_k = (2k choose k) / (k + 1) represents the
 *   number of ways that k pairs of parentheses can be arranged in a
 *   sequence to produce a proper expression (i.e. no prefix contains
 *   more right parentheses than left parentheses).
 *   There are (n choose 2k) ways to choose elements from A that will
 *   then be split among the subsets B and C. Focus on one arbitrary choice.
 *   Subsequently, there are (2k choose k) / 2 ways to put k of those
 *   elements into B and k of those elements into C, but without
 *   regards to the ordering of B and C.
 *   However, C_k = (2k choose k) / (k + 1) of those choices of subset
 *   pairs will necessarily have unequal sums due to the ordering of
 *   elements, hence they don't need to be tested.
 *   This means that for a given k >= 2, we need to test (n choose 2k)
 *   * ((2k choose k) / 2 - (2k choose k) / (k + 1)) subset pairs.
 *   Finally, we sum this term for k = 2, 3, 4, ... as long as 2k <= n,
 *   so that we consider all the sizes of k where we can take a pair
 *   of k-sized disjoint subsets of A.
 * Corollary:
 *   Although the derivation/justification is long, the amount of
 *   arithmetic is small enough to be doable by hand calculation.
setSize = 12;
CatalanNum[n_] := Binomial[n * 2, n] / (n + 1)
Sum[Binomial[setSize, i * 2] * (Binomial[i * 2, i] / 2 - CatalanNum[i]),
  {i, 2, Floor[setSize / 2]}]
Provide feedback

Saved searches

Use saved searches to filter your results more quickly

FilesExpand file tree

p106.mathematica

Latest commit

History

p106.mathematica

File metadata and controls