Zhao's Notebook

Computer Science, Math, and >>>



Leetcode 753. Cracking the Safe

Posted on 2018-08-30 | In Leetcode |

This is a programming problem on LeetCode, and the interesting part is that it can be solved from two different points of
view. We are going to analyze the problem and prove some properties and lemmas before using them, and finally turn them into code.
The problem is as follows:

Leetcode.753. Cracking the Safe

There is a box protected by a password. The password is a sequence of n digits, where each digit can be one of the first k digits 0, 1, ..., k-1.

You can keep inputting digits, and the box automatically matches the last n digits entered against the password.

For example, assuming the password is "345", I can open it when I type "012345", but I enter a total of 6 digits.

Please return any string of minimum length that is guaranteed to open the box after the entire string is inputted.

Approach #1: Graph Theory.

It's natural to view this as a graph theory problem. Take each (n-1)-digit number as a vertex. To form a single, unique trial of the password, we extend an edge from this vertex, forming an n-digit number; the edge leads to the new vertex given by that number's last n-1 digits (its [1:n] digits). So the problem turns into a graph problem defined as below:

Given a directed graph with $k^{n-1}$ nodes, where each node has $k$ outgoing edges to other nodes, find a path that visits every edge exactly once.

Sounds familiar, right? This is a very classic object in graph theory, called an Eulerian path. By definition, an Eulerian path is a trail in a finite graph that visits every edge exactly once. An Eulerian cycle is an Eulerian path that starts and ends at the same vertex. Eulerian paths and Eulerian cycles have the following properties:

  • An undirected graph has an Eulerian cycle if and only if every vertex has even degree and all of its vertices with nonzero degree belong to a single connected component.
  • A directed graph has an Eulerian cycle if and only if every vertex has equal in-degree and out-degree and all of its vertices with nonzero degree belong to a single strongly connected component.

The necessity of both conditions is easy to see: each time an Eulerian cycle passes through a vertex, it consumes one incoming edge and one outgoing edge, so the in-degree and out-degree must balance (in the undirected case, the degree must be even). Connectivity is needed because the cycle has to reach every edge.

To solve this problem, we have a theory called Hierholzer's Algorithm. The algorithm goes as follows:

  • Starting from a vertex u, we walk through unvisited edges until we get stuck. Because the in-degree and out-degree of each node are equal, we can only get stuck at u, which forms a cycle.
  • Now, for any node v we have visited that still has unvisited edges, we start a new cycle from v with the same procedure as above, and then merge the cycles together into a new cycle.
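The two steps above can be sketched directly on an adjacency-list graph. Below is a minimal sketch (not the safe graph itself; the 3-node graph is a hypothetical example) using the common iterative, stack-based form of Hierholzer's algorithm, which performs the cycle merging implicitly:

```java
import java.util.*;

public class HierholzerDemo {
    // Hierholzer's algorithm: walk unused edges until stuck, then
    // backtrack and splice in sub-cycles from vertices that still
    // have unused edges (handled implicitly by the explicit stack).
    static List<Integer> eulerianCycle(Map<Integer, Deque<Integer>> adj, int start) {
        Deque<Integer> stack = new ArrayDeque<>();
        List<Integer> cycle = new ArrayList<>();
        stack.push(start);
        while (!stack.isEmpty()) {
            int v = stack.peek();
            Deque<Integer> out = adj.get(v);
            if (out != null && !out.isEmpty()) {
                stack.push(out.poll());   // keep walking an unused edge
            } else {
                cycle.add(stack.pop());   // stuck: emit vertex, backtrack
            }
        }
        Collections.reverse(cycle);
        return cycle;
    }

    public static void main(String[] args) {
        // Hypothetical graph with edges 0->1, 0->2, 1->0, 2->0:
        // every vertex has equal in-degree and out-degree.
        Map<Integer, Deque<Integer>> adj = new HashMap<>();
        adj.put(0, new ArrayDeque<>(List.of(1, 2)));
        adj.put(1, new ArrayDeque<>(List.of(0)));
        adj.put(2, new ArrayDeque<>(List.of(0)));
        System.out.println(eulerianCycle(adj, 0)); // a cycle using all 4 edges once
    }
}
```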


Now the problem seems easy to solve: all we need to do is traverse every edge once and mark it as visited. We start from 000...0 and run a DFS that tries each digit in turn, appending the digit only after the recursive call returns (post-order); this post-order output is exactly the cycle merging of Hierholzer's algorithm. The code is as follows:

java

class Solution {
    Set<String> seen;
    StringBuilder ans;

    public String crackSafe(int n, int k) {
        if (n == 1 && k == 1) return "0";
        seen = new HashSet<>();
        ans = new StringBuilder();

        // Start from the node "00...0" (n-1 zeros).
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n - 1; ++i)
            sb.append('0');
        String start = sb.toString();

        dfs(start, k);
        ans.append(start);
        return ans.toString();
    }

    // Each edge (node + digit x) is one n-digit trial of the password.
    // Appending x after the recursive call returns (post-order) merges
    // the cycles in the Hierholzer fashion.
    public void dfs(String node, int k) {
        for (int x = 0; x < k; ++x) {
            String nei = node + x;
            if (!seen.contains(nei)) {
                seen.add(nei);
                dfs(nei.substring(1), k);
                ans.append(x);
            }
        }
    }
}

Approach #2: Inverse Burrows-Wheeler Transform

If we are familiar with the theory of combinatorics on words, recall that a Lyndon word L is a word that is the unique minimum of its rotations.

One important mathematical result (due to Fredricksen and Maiorana) is that concatenating, in lexicographic order, the Lyndon words whose length divides n forms a de Bruijn sequence: a sequence in which every word (of the $k^n$ available) appears as a substring of length n (where we are allowed to wrap around).

For example, when n = 6, k = 2, all the Lyndon words with length dividing n in lexicographic order are (spaces added for convenience): 0 000001 000011 000101 000111 001 001011 001101 001111 01 010111 011 011111 1. It turns out this is the lexicographically smallest de Bruijn sequence.

We can construct a de Bruijn sequence with the inverse Burrows-Wheeler transform. Mathematically, an inverse Burrows-Wheeler transform on a word w generates a multi-set of equivalence classes consisting of strings and their rotations. These equivalence classes each contain a Lyndon word as a unique minimum element, so the inverse Burrows-Wheeler transform can be considered to generate a set of Lyndon words. It can be shown that if we perform the inverse Burrows-Wheeler transform on a word w consisting of the size-k alphabet repeated $k^{n-1}$ times (so that it produces a word the same length as the desired de Bruijn sequence), then the result is the set of all Lyndon words whose length divides n. It follows that arranging these Lyndon words in lexicographic order yields a de Bruijn sequence B(k,n), and that this is the first de Bruijn sequence in lexicographic order. The following method can be used to perform the inverse Burrows-Wheeler transform, using its standard permutation:

  • Sort the characters in the string w, yielding a new string w'
  • Position the string w' above the string w, and map each letter's position in w' to its position in w while preserving order. This process defines the standard permutation.
  • Write this permutation in cycle notation with the smallest position in each cycle first, and the cycles sorted in increasing order.
  • For each cycle, replace each number with the corresponding letter from string w' in that position.
  • Each cycle has now become a Lyndon word, and they are arranged in lexicographic order, so dropping the parentheses yields the first de Bruijn sequence.

For example, to construct the smallest B(2,4) de Bruijn sequence of length $2^4 = 16$, repeat the alphabet (ab) 8 times, yielding w = abababababababab. Sort the characters in w, yielding w' = aaaaaaaabbbbbbbb. Position w' above w, map each element in w' to the corresponding element in w while preserving order, and number the columns 1 to 16 so we can read off the cycles of the permutation:

Starting from the left, the cycles are: (1) (2 3 5 9) (4 7 13 10) (6 11) (8 15 14 12) (16).

Then, replacing each number by the corresponding letter in w' from that column yields: (a)(aaab)(aabb)(ab)(abbb)(b).

These are all of the Lyndon words whose length divides 4, in lexicographic order, so dropping the parentheses gives B(2,4) = aaaabaabbababbbb.
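As a quick sanity check (a small sketch, not part of the original derivation), we can confirm that every length-4 word over {a, b} occurs exactly once as a wrap-around window of this sequence:

```java
import java.util.*;

public class DeBruijnCheck {
    // Collect every length-n window of s, allowing wrap-around.
    static Set<String> windows(String s, int n) {
        Set<String> seen = new HashSet<>();
        String t = s + s.substring(0, n - 1);  // simulate the wrap-around
        for (int i = 0; i < s.length(); ++i)
            seen.add(t.substring(i, i + n));
        return seen;
    }

    public static void main(String[] args) {
        // 16 windows, all distinct => all 2^4 binary words appear exactly once.
        System.out.println(windows("aaaabaabbababbbb", 4).size()); // prints 16
    }
}
```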

The code is as follows:

class Solution {
    public String crackSafe(int n, int k) {
        int M = (int) Math.pow(k, n - 1);
        // P is the standard permutation of the word w = (0 1 ... k-1)
        // repeated M times: position i*M + q in the sorted word w'
        // maps to position q*k + i in w.
        int[] P = new int[M * k];
        for (int i = 0; i < k; ++i)
            for (int q = 0; q < M; ++q)
                P[i * M + q] = q * k + i;

        // Follow each cycle of the permutation, emitting the letter
        // j / M at each position and marking it visited with -1.
        // The cycles are the Lyndon words, in lexicographic order.
        StringBuilder ans = new StringBuilder();
        for (int i = 0; i < M * k; ++i) {
            int j = i;
            while (P[j] >= 0) {
                ans.append(j / M);
                int v = P[j];
                P[j] = -1;
                j = v;
            }
        }

        // Append n-1 zeros so the wrap-around windows are covered.
        for (int i = 0; i < n - 1; ++i)
            ans.append('0');
        return ans.toString();
    }
}


The most attractive part is that, at first glance, this does not look like a graph problem or a word-combinatorics problem. Yet we can approach it from two different angles, each giving a different understanding of the problem. Looking deeper, we even find that combinatorics on words and graph theory are closely related to each other, which shows some of the beauty of math and computer science.

Dirichlet Process 1 - Bernoulli Process and Bayes Theorem

Posted on 2018-06-14 | In Math |

What is Dirichlet Process

The Dirichlet Process is a concept in probability theory; in fact, Dirichlet processes are a family of stochastic processes whose realizations (or observations) are probability distributions. They are often used in Bayesian inference to describe prior knowledge about the distribution of random variables: how likely it is that the random variables are distributed according to one particular distribution or another.

Why Dirichlet Process

In information retrieval, NLP, and many other fields, DPs are widely used in topic models. Meanwhile, the Dirichlet process has a very important property: it can be generalized to infinite dimensions and still remain computable.


Before we start our introduction to the Dirichlet Process, I'd like to begin with the Bernoulli process and the Beta distribution, since they are the two-dimensional entry point to the relatively complex Dirichlet distribution and Dirichlet process.

Bernoulli Process


Bernoulli Trial

We can think of this trial as tossing a coin, but the coin may be unbalanced: the probability of heads is $q$ $(0 \leq q \leq 1)$ rather than 50%. We record heads as $X=1$ and tails as $X=0$.

And a Bernoulli process is a series of independent, repeated Bernoulli trials. Repeating the trial n times, we record the number of heads ($X=1$) as k.

Now consider this situation: we want to know how unbalanced the coin is, so we toss it n times and count k. How do we estimate the probability q of getting a head ($X=1$)?

Ideally we could toss it forever, and $k/n$ would of course converge to q. But in real life we usually cannot try so many times. For instance, suppose we can only toss the coin four times and get "head head tail tail". Here $k/n$ is quite unreliable, so the best we can do is guess: we can say "it sounds more reasonable for q to be a certain value in $[0,1]$", but we cannot say "yes, this is the value of q."

If we get {head, head, tail, tail} in our four trials, 0.5 certainly seems a reasonable guess, 0.2 or 0.8 sound somewhat unlikely, and 0.05 or 0.95 would strike most people as a joke.

As people who love math and computer science, we'd like a mathematical tool to describe this, such as a probability density function. Here we meet another requirement: we want to update this function, based on the prior we already have, as we keep experimenting and obtain new observations. Naturally, Bayes' theorem fits well, because it updates probabilities with successive observations, and each update only needs the prior from the previous step. Now let's state the problem in more formal language.

When tossing a coin we get a random sample $X=(x_1, x_2, ..., x_n)$, and we want to estimate the plausible values of q in $[0,1]$ based on these n observations. Since we can't describe q with one definite value, we use a distribution to do the job: a probability density function over q. Naturally we write it as a conditional density $f(q|X)$, because we are estimating q under the condition of having observed X.

Now let's have a look at Bayes' theorem:

$$P(q|X) = \frac{P(X|q)\,P(q)}{P(X)}$$

$P(q)$ is the prior probability of q, and $P(q|X)$ is the posterior probability. Notice that we are saying "probability" rather than "probability density function". To combine Bayes' theorem with density functions, we replace $P(q)$ with $f(q)$ and $P(q|X)$ with $f(q|X)$.

Here $P(X)$ is a constant, so we reach a conclusion: the posterior probability density function $f(q|X)$ of q is proportional to its prior $f(q)$ times a conditional probability, which is:

$$f(q|X) \propto P(X|q)\,f(q)$$

With this result, let's go back to the coin problem.

Tossing the coin n times is a Bernoulli process, so for a fixed q the probability of a particular sequence of n tosses containing k heads is $P(X=x|q)=q^k(1-q)^{n-k}$.
So $f(q|X)$ is proportional to the prior times the conditional probability above, which we write as:

$$f(q|X) \propto q^k(1-q)^{n-k}\,f(q)$$

Now let's look at $f(q)$. Naturally, when we know nothing about the coin, we should consider that q can be any value in $[0,1]$. So we prefer to take $f(q)$ to be uniformly distributed on this interval, which is $f(q)=1$ for q in $[0,1]$.

Now we can see that in the formula $f(q|X) \propto q^k(1-q)^{n-k}f(q)$, the factor $q^k(1-q)^{n-k}$ times a uniform distribution on $[0,1]$ results in a Beta distribution.
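To make the update concrete, here is a small numeric sketch (not from the original notes) that evaluates the unnormalized posterior $q^k(1-q)^{n-k}$ on a few grid points for the {head, head, tail, tail} example, where k counts heads:

```java
public class PosteriorDemo {
    // Unnormalized posterior for coin bias q after observing
    // k heads in n tosses, under a uniform prior f(q) = 1:
    // f(q | X) is proportional to q^k * (1-q)^(n-k).
    static double unnormalizedPosterior(double q, int n, int k) {
        return Math.pow(q, k) * Math.pow(1 - q, n - k);
    }

    public static void main(String[] args) {
        int n = 4, k = 2;  // e.g. {head, head, tail, tail}
        for (double q : new double[]{0.05, 0.2, 0.5, 0.8, 0.95})
            System.out.printf("q = %.2f  ->  %.6f%n", q,
                              unnormalizedPosterior(q, n, k));
        // The density peaks at q = 0.5, matching the intuition above:
        // 0.5 is the most reasonable guess, 0.05 and 0.95 the least.
    }
}
```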

What's next

In this chapter we discussed the Bernoulli process and how it combines with Bayes' theorem. It may still look a little far from the Dirichlet process, but this is fundamental background, and with a little variation we will reach our Dirichlet.

In the next chapter, we'll discuss the Beta distribution and talk a little about an important property called conjugacy, which is indeed what makes the Dirichlet process so important.

Start a Smart Project with Github

Posted on 2018-05-11 | In Engineer |

GitHub is now a widely used platform that helps people develop and distribute their projects. Knowing how to use some of GitHub's features can make your development process fluent and stable.

  • Issue and pull-request
  • Automatic test
  • Protected Master branch

Issue and Pull Request

When building a project, the development history should be open and trackable. It is therefore recommended to track every change and improvement; using Issues to track improvements will serve developers well.

Every issue has an index, marked with #, such as #3.

Pull requests are used for cooperation, and GitHub provides many useful features around them. One of them is closing an issue with a pull request. To do so, just make the description of your pull request look like this:

Message: Pull request from forked "xxx".

Description: Resolve #3.

"Resolve" is a token that will close an issue, so will "Fix" and "close" do. Check official doc for further information.

Automatic test

Every time you commit something to your project, you have to make sure the whole project doesn't break because your commit contains bugs. Luckily, Travis CI can check this, and it is supported by GitHub, which means you can see whether the tests passed before accepting a pull request.

Setting up the automatic tests is very simple: all you need to do is add a .travis.yml file to the root directory of your project.
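As a starting point, here is a minimal .travis.yml sketch for a Java project; the JDK version and build command are assumptions to adjust for your own setup:

```yaml
# Minimal Travis CI configuration for a Java project (a sketch;
# adjust the JDK and build command to match your project).
language: java
jdk:
  - openjdk8
script:
  - mvn test
```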

See official doc for more help.

Protected Master branch

The master branch is very important to your project and should not be manipulated casually; it is recommended to add several protection checks before any commit is merged into master.

It is recommended to enable branch protection in the repository settings: under Settings → Branches, add a protection rule for master and, for example, require pull request reviews and passing status checks before merging.

Then look! Your master branch is under protection!

Leetcode 17. Letter Combinations of a Phone Number

Posted on 2018-03-30 | In Leetcode |

Given a digit string, return all possible letter combinations that the number could represent.

A mapping of digits to letters (just like on the telephone buttons) is given below.

Input:Digit string "23"
Output: ["ad", "ae", "af", "bd", "be", "bf", "cd", "ce", "cf"].

Note:
Although the above answer is in lexicographical order, your answer could be in any order you want.

java

public List<String> letterCombinations(String digits) {
    LinkedList<String> ans = new LinkedList<>();
    if (digits.length() == 0) {
        return ans;
    }
    String[] mapping = new String[]{"0", "1", "abc", "def", "ghi", "jkl", "mno", "pqrs", "tuv", "wxyz"};
    ans.add("");
    for (int i = 0; i < digits.length(); i++) {
        int x = Character.getNumericValue(digits.charAt(i));
        // Strings of length i are the results of the previous round: pop
        // each one and re-enqueue it extended with every letter of this digit.
        while (ans.peek().length() == i) {
            String temp = ans.remove();
            for (char c : mapping[x].toCharArray()) {
                ans.add(temp + c);
            }
        }
    }
    return ans;
}

This algorithm uses a queue to make sure every new character is appended to each of the strings from the previous round.
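To see the queue in action, here is a small self-contained trace (a sketch mirroring the solution above; the class and method names are for illustration only) that prints the queue after each digit of "23" is processed:

```java
import java.util.*;

public class PhoneTrace {
    // Same queue technique as the solution above, extracted into a
    // method so the intermediate queue states can be inspected.
    static List<String> combos(String digits) {
        String[] mapping = {"0", "1", "abc", "def", "ghi", "jkl",
                            "mno", "pqrs", "tuv", "wxyz"};
        LinkedList<String> ans = new LinkedList<>();
        ans.add("");
        for (int i = 0; i < digits.length(); i++) {
            int x = Character.getNumericValue(digits.charAt(i));
            // Entries of length i are from the previous round: pop each
            // and re-enqueue it extended with every letter of this digit.
            while (ans.peek().length() == i) {
                String temp = ans.remove();
                for (char c : mapping[x].toCharArray())
                    ans.add(temp + c);
            }
            System.out.println("after '" + digits.charAt(i) + "': " + ans);
        }
        return ans;
    }

    public static void main(String[] args) {
        combos("23");
        // after '2': [a, b, c]
        // after '3': [ad, ae, af, bd, be, bf, cd, ce, cf]
    }
}
```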

Discrete Random Variables And Expectation 1

Posted on 2018-03-22 | In Math |

Random Variables and Expectation 1

Definition 2.1

A random variable $X$ on a sample space $\Omega$ is a real-valued (measurable) function on $\Omega$; that is, $X: \Omega \rightarrow \mathbb{R}$. A discrete random variable is a random variable that takes on only a finite or countably infinite number of values.

Since the random variables are functions, they are usually denoted by capital letters, while real numbers are usually denoted by lowercase letters.

Definition 2.2

Two random variables $X$ and $Y$ are independent if and only if

$$\Pr((X = x) \cap (Y = y)) = \Pr(X = x) \cdot \Pr(Y = y)$$

for all values x and y. Similarly, random variables $X_1, X_2, \ldots, X_k$ are mutually independent if and only if, for any subset $I\subseteq[1,k]$ and any values $x_i$, $i\in I$,

$$\Pr\left(\bigcap_{i \in I} (X_i = x_i)\right) = \prod_{i \in I} \Pr(X_i = x_i).$$

Definition 2.3

The expectation of a discrete random variable X, denoted by $E[X]$, is given by

$$E[X] = \sum_{i} i \Pr(X = i),$$

where the summation is over all values i in the range of X.
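As a quick illustration of the definition (an example not in the original notes), the expectation of a fair six-sided die is each face value weighted by its probability 1/6:

```java
public class ExpectationDemo {
    // E[X] = sum over values x in the range of X of x * Pr(X = x).
    static double expectation(double[] values, double[] probs) {
        double e = 0;
        for (int i = 0; i < values.length; i++)
            e += values[i] * probs[i];   // each value weighted by its probability
        return e;
    }

    public static void main(String[] args) {
        // A fair six-sided die: each face has probability 1/6.
        double[] faces = {1, 2, 3, 4, 5, 6};
        double[] probs = new double[6];
        java.util.Arrays.fill(probs, 1.0 / 6.0);
        System.out.println(expectation(faces, probs));  // ~3.5
    }
}
```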

First Blog

Posted on 2018-03-09 |

Why do I start to write blogs?

Well, this blog is more like a personal notebook that helps me review the things I have liked and tried. Explaining something to others helps me examine whether I really understand it or not, and as I have no cat to explain my work to so far (but I'll have one someday), I decided to start writing this blog. I'd be happy if you want to discuss anything with me.

What will this blog mainly be about?

Most of the time this blog will be about technical issues, scientific problems (mostly computer science and math), and software engineering problems, but there's still a chance I get crazy and write some silly funny stuff.

Oh, if I really get a cat, this blog will have a new chapter called CAT with cat photos.


Zhao Fengbo

16 posts
5 categories
18 tags
GitHub E-Mail
© 2020 Zhao Fengbo
Powered by Hexo
|
Theme — NexT.Mist v6.0.6