Zhao's Notebook

Computer Science, Math, and >>>


  • Home

  • Archives

  • Categories

  • Tags

内容与形式:严肃的《炎拳》和《电锯人》 上

Posted on 2020-10-21 | In Other |

写在前面

如果你喜欢 Rick & Morty,却又没看过《炎拳》和《电锯人》,那么你应该关掉这个页面,赶紧去把这两部作品看完。

这年头上网,一天看不见一两个烂梗的概率可能比买彩票中头奖的概率还低。烂梗的生与死组成了现代互联网生活的一部分。有的烂梗来源于真正烂的生活,是梗让其升华;而有的烂梗则来自于真正有意义的思想,若就这样让烂梗自生自灭,就显得有些遗憾。“对他使用炎拳吧”,“马骑马,我的坏女人”,“好耶!”这几个烂梗都来自于藤本树(藤本たつき)之手,分别来自两本漫画《炎拳》和《电锯人》。乍一看,这两本漫画都充斥着已经被垃圾作品或经典作品用烂的一切要素:无端死人,迷之反转,毫无铺垫的超展开,从来没提过的突然插入的设定。再加上绝对说不上讨喜的画风和网络上常见的莫名其妙的玩梗,这两部作品很容易被人归类于“弱智小鬼为黑而黑为噱头而噱头的脑残作”。

但事实并非如此,尽管这两部作品的爱好者大多心照不宣地玩烂梗(而这也是这两部作品的主要思想的外在映射之一),我还是想要仔细的分析这两部作品严肃艺术的内核和它们所达到的高度。《炎拳》的严肃主要体现在于其哲学内核,而《电锯人》的严肃主要体现在其对“漫画”这一艺术形式在技巧上的拔高。

“活下去”

一言以蔽之曰,《炎拳》是一个末世超能力者大乱斗的故事。

单这样看的话,这部作品可以说是毫无新意,开头几章的内容也是土到掉渣:因为“冰之魔女”的严寒,缺乏能量供给的人类濒临灭绝的边缘。拥有再生能力的主角和他的妹妹切自己的肉为村庄提供食物,被路过征粮的王国中队长发现村庄吃人肉的秘密,中队长使用了“不燃至死亡就不会熄灭的火焰”将整个村庄焚为平地,只有主角因为强大的再生能力,无限地燃烧又无限地再生,带着无尽的痛苦踏上了向中队长复仇的道路。。。

够土了吧,在这个设定里出现的几个矛盾:气候变化与人类生存的矛盾,自我牺牲奉献和世俗道德的矛盾,两种互相拮抗的能力产生的肉体痛苦的矛盾,个人的复仇和强大对手的矛盾,都是少年漫画里稍微想要弄的黑深残一点的作品常用的几个套路。如果《炎拳》止步于此,它可能只能算一个垃圾作品,根本不值得有任何深入的讨论;把这么多的套路一口气全部堆在一起,不是垃圾就是另有图谋:《炎拳》则恰恰是对这些套路另有图谋的那种作品。《炎拳》前半部的情节发展,都在质疑这一个问题:

这些设定,真的有意义吗?

少年漫画里主角活动的唯一目的,就是解决矛盾。常见的设定里,气候变化是由“冰之魔女”引起的,所以人类一直想着要去打倒“幕后大boss”,夺回温暖的地球。可是事实上“冰之魔女”并不存在,人类对“幕后boss”的抗争毫无意义;牺牲自己拯救大家和人人生而平等的矛盾在“随便你怎么做人类肯定是要灭亡了大家死了就都都平等了”面前也毫无意义;肉体燃烧的痛苦更是在正式开始复仇旅程之后干脆就没出现过了;主角从开始走路就强的逆天,没有挫折没有失败的可能,哪里来的强大对手,说龙傲天就龙傲天,把“挫折——成长——再挑战”的少年漫画精髓嘲笑了个够;大反派原来根本就不是反派,你的复仇没有意义;大反派的洗白也没意义,因为主角还是顺手把他杀了,洗白本身也没有意义。整个前半部,《炎拳》就在解释:所有的矛盾,不论大小,全都没有意义,全都不能构成主角活动的动力。这些东西全部加在一起,都不如电影导演利贺田想拍好电影来的有意义:如果你觉得利贺田拍电影很荒诞,你就更应该觉得主角所做的一切事情都无比荒诞。

这也是为什么在杀死大反派之后,主角选择了自杀:倘若生活没有意义,活着又是为了什么?

从虚到实

活着对身体不好,让她赶快去死

前半部《炎拳》在漫画里否定了主角活着的意义,导致了主角的自杀,但是也在借利贺田之口,向所有读者问了同样的问题:你的生活,就又意义吗?

“真正严肃的哲学问题只有一个,就是自杀”。

有多少人真正认真的问过自己这个问题:我究竟为什么不自杀?当然这个问题显得消极,容易让操心的老妈担惊受怕甚至直接去给你找个心理医生,我们换一个完全等同的问题:

我究竟是为什么而活?

《炎拳》的主角走路,杀人,救人,杀人,救人,杀人,救人,然后突然醒悟:这一切都没有意义。我们呢?起床,吃饭,上四小时班,吃饭,再上四小时班,再吃饭,上网或者看电视,周一周二周三周四周五周六周日,过一些纪念日,买一些礼物,庆祝一些活动,周而复始,日复一日。突然有一天,嘴角不自觉地发出一声嗤笑,“为什么?”。这个问题暴烈地闯入你的思想,

很难想象人为了任何一种目的而产生的。为了方便理解,我们可以先思考剪刀:剪刀是有其意义的:它是为了剪东西而被创造出来的,如果不为了剪东西这一目的,很难想象出为什么会有人创造出长成这样的一个物品。而石头则不为任何目的而被创造出来:它存在就存在了,并没有任何的意义和目的。

人呢?人更接近剪刀,还是更接近石头?

存在主义的观点认为,人更接近石头。很难想象人生来就被赋予某种义务或责任,对一个刚出生地婴儿施加的任何一点期望都显得滑稽可笑:不论是最虔诚的神父对这婴儿祝福“你生来就是为了拯救世人”还是最热忱的革命者对婴儿断言“你生来就是为了全人类的解放而奋斗”,想想这个场景,除了滑稽还有什么?或者换一种方式:是否有这样一件事,当你完成了它之后,你便可以立即死去而不会有任何犹豫和遗憾?我们认为:没有;即使有人会讲“朝闻道夕死足矣”,也没有谁会真的“闻道”之后立即自杀。因此我们得到一个结论:金钱,权利,宗教,政治理想,科研,它们统统是生命的一种选择,但绝不是生命与生俱来的目的。假使一个人未经自己的思考和挣扎,仅仅因为其生来就觉得这理所当然,就为了任何一种世俗的东西而放弃生命,它表现为宗教的自杀式袭击,工作到把自己送进ICU,和领导喝酒回家路上一头栽进河里,单单是想象都觉得无比滑稽可笑。

在向外寻找人生的目的失败后,我们只能向内寻找:我们能否认为,活着本身就是活着的目的呢?

我们每天用宝贵的生命读书,上班,挣钱,想着办法算卡路里,运动,健康饮食,交着用宝贵的生命换来的钱给医保和医生,就为了换来更长的生命。这一切构成了一个愚蠢的交易:用自己的生命换自己的生命。电影导演简单的一句反驳:“活着对身体也不好,建议赶紧去死”。揭露了这一想法的荒谬的本质:我们不能认为活着就是活着的目的,因为如是如此,活着本身就会消耗活着,一个工具本身就会损伤目的的话,我们最好的方法就是抛弃这一工具;但是此时抛弃工具就是抛弃目的,所以人是无法在这一命题中得到自洽的结果的。

《炎拳》用更隐晦,通俗的方式探讨了同样的问题:我们一直坚信的活着的意义,它真的存在吗?我们被剥夺了所有的意义之后,又该何去何从?

《炎拳》并没有像《局外人》和《禁闭》一样,用严肃文学的形式讨论严肃的哲学思考。相反的是,《炎拳》从头到尾都充斥着娱乐元素。设定上是末世的超能力者大乱斗,可以说是烂透天际的题材;表现形式上则是恶搞漫画和B级片的融合,也是娱乐至死的到甚至会有些倒胃口的选择。这样的选择使得《炎拳》本身比一般的文艺作品更通俗,更娱乐,也就更容易被人接受:这一点从铺天盖地的炎拳烂梗就能看出来;坏处是,批了这样的一层皮之后,在内核的讨论上很难达到足够的高度。

人为什么不自杀

在主角领悟作为一本漫画的主角,他原本所相信的所有存在的动力都是毫无意义的之后,自然而然的,主角陷入了一个这样一个的境地,他不得不做一个决定:是自杀以终结无意义的一切,还是无意义地继续活着。

主角第一次认识到作为漫画主角却无意义的荒谬的时候,选择了物理上的自杀,“即使是一个野蛮人也能做出的懦夫的决定”。漫画中阻止主角自杀的原因非常的简单也非常的没有道理:只要有人在主角寻死时阻止他,然后对他说一句“活下去”,主角就不再自杀了。在生活中,能阻止我们物理自杀的东西也差不多:单单是对生的依恋,就足够超越世间所能遇到的一切苦难。世上的绝大多数人都不自杀是很难找出理由的,只能认为即使不存在动机,人的存在也天生就反抗要将自己消灭的力量。更何况自杀并没有解决人生没有意义的问题,单纯只是逃避了思考和解决这个问题而已。

除了物理上的自杀,还有精神上的自杀。藤本树在《炎拳》里,将读者放在全知的视角,再让主角给自己赋予一个不切实际的希望,来讽刺所谓希望对于解决人生荒谬这件事情毫无意义。后半段中漫画中的主角欺骗自己,总有办法可以复活妹妹,或者杀了已经死去的仇人,这些漫画中习以为常而不加质疑的目标,读者作为上帝视角看在眼里,当然知道这些都是不可能的,无论如何这些希望都会最终背叛主角。

假如我们的人生也有这样的读者,作为局外人来看待我们的生活呢?我们对生活的最深切的希望,就是健康的活下去,但是这是必定会被背叛的:人终有一死。那么其他建立在活着的基础上的希望也一定会被背叛。因此,我们对自己的生活的掌握是不能建立在这些不切实际的希望上的,这就是生活的荒谬。相反,我们清楚的认识到:人生本来就毫无意义,这便是我们能掌握的唯一对抗荒谬的希望。

如何理解这一点:想象一个教士向你解释,说在世间行的善,死后便会上天堂,你会不由得认为:我所掌握的,理解的,远比教士更多:他虽在天堂希望了很多,但可惜那就是虚假的;我虽然在尘世希望的很少,但样样为真。同样,生活中那些不切实际的希望,就像教士所说的“死后的天堂”,虽然很多,却并不真实;但是认清生活本身就毫无意义,单这一点,却因为其真实而远比“教士们”所掌握的更多了。

卡缪,萨特,陀思妥耶夫斯基,雨果,诸多聪慧的思想都遇到过这个问题:人活着没意义,但我也不想死,怎么办?既然他们也都没有得出一个能让自己真正信服的答案,我们也不必苛求自己。因为我们虽掌握的少,却也确实掌握了一些东西。而且这些思考最终所指向的结论也不会令人痛苦:

我们最终得到了这样一个结论:生活本身没有意义,但是那也还好,并不太糟糕。花还是红,草还是绿,游戏还是一样的好玩。不再为生活寻找“意义”,而是为自己做选择,我们才能活得更好,生活才能真正的属于自己。不再为“这个工作挣钱多”而工作,而是“我喜欢干这件事”而选择做一份工作。不再为“人应该结婚生子”而仓促走上一条单行道,而要因“我喜欢组建一个家庭”而作出选择。因理性的思考而抛弃天生的,强加的,无意识的人生的意义,人的全部意义都在人为自己所做的选择。

我们乐观地认为:人皆有一死,但是我们还是选择继续生活。在享受为自己的生活做选择的过程中,我们表达了对生命荒谬的反抗,并拥抱了真正的自由。

KickStart 2019 Round D Question B. Latest Guests

Posted on 2019-08-08 | In Computer Science |

Problem Link

See here.

Problem

The city of Circleburg has a large circular street with N consulates along it. The consulates are numbered 1, 2, ..., N in clockwise order.

Today G guests, numbered 1, 2, ..., G will drive along the circular street for M minutes. Each guest is either a clockwise guest (denoted by the character C) or an anti-clockwise guest (denoted by the character A).

The i-th guest starts at the consulate numbered and at the end of each minute will drive to an adjacent consulate. The i-th guest starts at the j-th consulate. If that guest is:

  • a clockwise guest, they will drive to the (j+1)-th consulate (unless they are at the N-th consulate, then they will drive to the 1st consulate).
  • an anti-clockwise guest, they will drive to the (j-1)-th consulate (unless they are at the 1st consulate, then they will drive to the N-th consulate).

Each consulate will only remember the guest that visited them last. If there are multiple guests who visited last, then the consulate will remember all of those guests.

For each guest, determine how many consulates will remember them.

Input

The first line of the input gives the number of test cases, T. T test cases follow. Each testcase begins with a line containing the three integers N, G and M, which are the number of consulates, the number of guests and the number of minutes respectively. Then, G lines follow. The i-th line contains the integer followed by a single character; C if the i-th guest is a clockwise guest or A if the i-th guest is an anti-clockwise guest.

Output

For each test case, output one line containing Case #x: y1 y2 ... yG, where x is the test case number (starting from 1) and yi is the number of consulates that remember the i-th guest.

Sample input and output

Input

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
4
5 3 2
5 C
2 A
1 A
2 4 0
1 A
1 C
1 A
1 C
3 2 10
3 C
2 A
6 1 6
4 A

Output

1
2
3
4
Case #1: 2 2 1
Case #2: 1 1 1 1
Case #3: 2 2
Case #4: 6

Solution

Instead of testing you with the knowledge of smart algorithms or tricky skills, this problem is more of an engineer skill testing: there are too many corner cases that requires consideration. Thus, how to write the code so that it could be maintainable would be a real challenge.

In order to do so, I'll try to give a vague explanation of the methods we used to solve this problems, and all other details would remain in the sample code, and I wish the readers can understand it by reading the code.

Analysis

Let's consider the clockwise guests only at the beginning. For any guest, finishing many circles would result in the same thing as finishing at most one circle, thus m could be reduced to a much smaller number.

Then let's look at the problem from the consulates' point of view. Each consulates only tracks it's latest visited guest, and to track that information, it's just natural to think about using Queue (linkedlist). Here's what we do: we keep a sliding window of length m starting at 0, and add each guests to queue from the furthest to closest anti-clockwise (for clockwise guests). And in each move, we let the sliding window move one step forward clockwise, and add the new guest in the new position into the queue if it exists; Also we poll out the guest who stands in the position which just left the window in this move, also only when such guest exists.

In each move, the one that can be remembered by a consulate is the one that is the furthest from it, which is just the tail of the queue when the queue is not empty (things are a little bit different for anti-clockwise). And be sure to only remember the furthest one between the clockwise candidate and the anti-clockwise candidate. If they're of the same distance, remember both.

Furthermore, the guests starting at the same positions would behave totally the same, thus we can group them together and track the group behavior only, and assign the results to each member in a group when the calculation is finished. Consider the situation that both n and g are large, but all g starts at the same position. If we track each guests independently, we'll need time just for recording. If we track the group, we'll only need .

Code

There're too many corner cases, so I left every detail in the code, hope you can get it.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.*;

public class Solution {
public static void main(String[] args) {
Solution solution = new Solution();
Scanner in = new Scanner(new BufferedReader(new InputStreamReader(System.in)));
int T = in.nextInt(); // Scanner has functions to read ints, longs, strings, chars, etc.
for (int t = 1; t <= T; ++t) {
int n = in.nextInt();
int g = in.nextInt();
int m = in.nextInt();
int[] dirs = new int[g];
int[] poses = new int[g];
in.nextLine();
for (int i=0; i<g; i++) {
String[] s = in.nextLine().split(" ");
// Reset the index from 1-based to 0-based
poses[i] = Integer.parseInt(s[0])-1;
dirs[i] = (s[1].equals("C"))? 1: -1;
}
int[] ans = solution.solve(n, g, m, dirs, poses);
System.out.print("Case #" + t + ":");
for (int i: ans) {
System.out.print(" " + i);
}
System.out.println("");
}
}

private int[] solve(int n, int g, int m, int[] dirs, int[] poses) {
if (m > n) {
m = (m % n) + n;
}
int[] ans = new int[g];
int[] clockCounts = new int[n];
int[] antiCounts = new int[n];

boolean[] clockGuests = new boolean[n];
boolean[] antiGuests = new boolean[n];

// Group quests with same starting positions.
initializeGuestsArray(clockGuests, antiGuests, g, dirs, poses, ans);

// Travel each consulate in a clockwise order
LinkedList<Integer> queueClock = new LinkedList<>();
LinkedList<Integer> queueAnti = new LinkedList<>();
int clockTail = -m;
while (clockTail < 0) {
clockTail += n;
}
int antiHead = m % n;
int antiTail = 0;
for (int i=0; i<n; i++) {
if (i == 0) {
initializeWindow(queueClock, clockGuests, n, m);
initializeAntiWindow(queueAnti, antiGuests, n, m);
} else {
// Move window
if (!queueClock.isEmpty() && queueClock.peek() == clockTail) {
queueClock.poll();
}
if (clockGuests[i]) {
queueClock.offer(i);
}
clockTail++;
if (clockTail == n) {
clockTail = 0;
}

// Anti window moving
if (!queueAnti.isEmpty() && queueAnti.peek() == antiTail) {
queueAnti.poll();
}
antiHead++;
if (antiHead == n) {
antiHead = 0;
}
if (antiGuests[antiHead]) {
queueAnti.offer(antiHead);
}
antiTail++;
if (antiTail == n) {
antiTail = 0;
}
}
// Record the result of this step.
recordStepResult(queueClock, queueAnti, i, n, m, clockCounts, antiCounts);
}

// Get answers
for (int i=0; i<g; i++) {
if (ans[i] < 0) {
ans[i] = antiCounts[-(ans[i]+1)];
} else {
ans[i] = clockCounts[ans[i]];
}
}
return ans;
}

private void recordStepResult(LinkedList<Integer> queueClock, LinkedList<Integer> queueAnti,
int i, int n, int m, int[] clockCounts, int[] antiCounts) {
int clockDis = -1, antiDis = -1;
if (!queueClock.isEmpty()) {
clockDis = i - queueClock.peek();
// If one quest visited it multiple times,
// the distance would be within m and as large as possible
while (clockDis + n <= m) {
clockDis += n;
}
}
if (!queueAnti.isEmpty()) {
antiDis = queueAnti.peekLast() - i;
if (antiDis < 0) {
antiDis += n;
}
// If one quest visited it multiple times,
// the distance would be within m and as large as possible
while (antiDis + n <= m) {
antiDis += n;
}
}
int dis = Math.max(antiDis, clockDis);
if (dis != -1) {
if (dis == clockDis) {
addIntoAnswer(clockCounts, queueClock.peek());
}
if (dis == antiDis) {
addIntoAnswer(antiCounts, queueAnti.peekLast());
}
}
}

private void initializeAntiWindow(LinkedList<Integer> queueAnti, boolean[] antiGuests, int n, int m) {
for (int i=0; i<=m; i++) {
int pos = (i >= n)? i % n: i;
if (antiGuests[pos]) {
queueAnti.offer(pos);
}
}
}


private void initializeWindow(LinkedList<Integer> queue, boolean[] clockGuests, int n, int m) {
int pos = 0 - m;
while (pos < 0) {
pos += n;
}
while (m-- >= 0) {
if (clockGuests[pos]) {
queue.offer(pos);
}
pos++;
if (pos == n) {
pos = 0;
}
}
}

private void initializeGuestsArray(boolean[] clockGuests, boolean[] antiGuests,
int g, int[] dirs, int[] poses, int[] ans) {
for (int i=0; i<g; i++) {
int dir = dirs[i], pos = poses[i];
if (dir == -1) {
antiGuests[pos] = true;
ans[i] = -pos - 1;
} else {
clockGuests[pos] = true;
ans[i] = pos;
}
}
}

private void addIntoAnswer(int[] counts, int pos) {
counts[pos]++;
}
}

KickStart 2019 Round D Question A. X or What

Posted on 2019-08-01 | In Computer Science |

Problem Link

See here.

Problem

Steven has an array of N non-negative integers. The i-th integer (indexed starting from 0) in the array is Ai.

Steven really likes sub-intervals of A that are xor-even. Formally, a sub-interval of A is a pair of indices (L, R), denoting the elements . The xor-sum of this sub-interval is xor xor ... xor xor , where xor is the bitwise exclusive or.

A sub-interval is xor-even if its xor-sum has an even number of set bits in its binary representation.

Steven would like to make Q modifications to the array. The i-th modification changes the $P_i$-th (indexed from 0) element to $V_i$. Steven would like to know, what is the size of the xor-even sub-interval of A with the most elements after each modification?

Input

The first line of the input gives the number of test cases, T. T test cases follow.

Each test case starts with a line containing two integers N and Q, denoting the number of elements in Steven's array and the number of modifications, respectively.

The second line contains N integers. The i-th of them gives $A_i$ indicating the i-th integer in Steven's array.

Then, Q lines follow, describing the modifications. The i-th line contains $P_i$ and $V_i$, The i-th modification changes the $P_i$-th element to $V_i$. indicating that the i-th modification changes the $P_i$-th (indexed from 0) element to $V_i$.

Output

For each test case, output one line containing Case #x: y_1 y_2 ... y_Q, where x is the test case number (starting from 1) and y_i is the number of elements in the largest xor-even sub-interval of A after the i-th modification. If there are no xor-even sub-intervals, then output 0.

Sample Input and Output

Input

1
2
3
4
5
6
7
8
9
2
4 3
10 21 3 7
1 13
0 32
2 22
5 1
14 1 15 20 26
4 26

Output

1
2
Case #1: 4 3 4
Case #2: 4

Solution

This is a problem that we can tear it apart into two sub-problems, how to determine if a number has even or odd 1s in its bits, and how to find the longest subarray. Let's do it one by one.

How to determine if a number has even or odd 1s in its bits?

This one looks simple, while it can also be pretty tricky. Let's start from the most naive one.

Naive

Naively compare every bits and count them, see if there's even of them.

1
2
3
4
5
6
7
8
9
10
boolean evenOnes(int n) {
int count = 0;
while (n != 0) {
if ((n & 1) == 1) {
count++;
}
n >>= 1;
}
return (count & 1) == 0;
}

A small improve

We have noticed that this require $k$ times of comparisons, while $k$ denotes the number of bits of the integer. In Java 8, $k = 32$.

Can we improve it a little bit? Here a method that allow us to compare exactly $a$ times, while $a$ denotes the total number of 1 in $n$'s bits presentation. Here's how we do it:

1
2
3
4
5
6
7
8
boolean evenOnes(int n) {
int count = 0;
while (n!=0) {
n &= n-1;
count++;
}
return (count & 1) == 0;
}

How did this work? Take a look at the figure.

Figure 1

Say we split the binary representation of the integer n into two parts, b denotes from the first 1 to the lowest bit, and a for the rest.

For $n-1$, we can see that a part stayed the same and every bits get flipped in b part, thus the & operation between will keep everything in a the same, and turn every bits in b into 0. As a result, we erased the first 1 in $n$, from the lowest bit to highest. The loop terminates until $n = 0$, which indicates that all 1 have been erased from $n$. Counting the time of the loop, would be the number of 1s in n.

Can we do even better?

Yes we can, but to be strictly correct, we cannot actually call this better under all circumstance (Imagine if the n is 0, can we actually do it any better?). The fact is we can do this by do five comparison under all circumstance, thus for any $n$ that has more than five 1s, it's better, other than that, it's actually worse.

But we're presenting the method anyway, because, firstly, the method is tricky and interesting, and secondly, in practice, this method is stable, especially for input that has a lot of 1, it's much faster.

Here's the code, and it's pretty simple.

1
2
3
4
5
6
7
8
boolean evenOnes(int x) {
x ^= x >> 1;
x ^= x >> 2;
x ^= x >> 4;
x ^= x >> 8;
x ^= x >> 16;
return (x&1) == 0;
}

Now, here's the tricky part, let's start from the first comparison: x ^= x >> 1

Figure 2

See from the figure, and concentrate on the i-th bit. Under what circumstance would the i-th bit be 1 ? The answer is there's only one 1 at the i-th and (i+1)-th bit in x. Which means, if the result bit at the i-th bit is 1, that indicates there are odd number of 1s between the i-th and the (i+1)-th bits, and vice versa.

Now let's take a look at the second comparison: x ^= x >> 2;, which can be indicated by the figure below:

Figure 2

Now remember the x is the x after the first comparison, which means the i-th bit, a indicates whether there're odd number of 1 in the i-th and (i+1)-th bits in the original input, and for b, it indicates whether there're odd number of 1 in the (i+2)-th and (i+3)-th bits.

Now let's see what would happen after this comparison. Consider this: What would a and b like, if we want the i-th bit is 1 after the operation? The only answer is that, only one out of a and b is 1. Recall that a and b indicates the oddness of the original input on there representing range, and only one of them have odd 1s. So if we merge the two ranges, the new range would have odd 1s in it. Thus we could say, the bit on i-th position indicates the oddness of 1s from i-th position to (i+3)-th position in the original input.

The same things goes all the way to the range is length 32, the number of bits of an integer in Java 8. Then we can say: The bit at the 0-th position indicates the oddness of 1 for the range from 0 to 31, which is the total input.

Now we have solved our first problem, let's see the second.

Find the Longest Sub-array that Every Element Has Even 1's.

Here, Even means odd-even even, not an "equal" even.

The brute-force way, of course, it after each modification, go through all the sub-arrays and record the longest one. This is a $O(n^2)$ algorithms and too trivial that we don't actually need to discuss more about it.

In order to improve it, we have noticed an interesting fact: For any a & b = x, the x would have odd 1's if and only if one of a and b have odd 1's. Prove it as an exercise, it's not hard!

So here are two conditions: the number of odd-1's element in the array is even, and the number is odd. Things are pretty easy for the even situation: the longest sub-array is the whole array. For the odd situation, the longest sub-array is one of the following sub-arrays: the subarray right after the first odd-1's element to the end, and the subarray from the beginning to the element right before the last odd-1's element.

To avoid looking for the first and last odd-1's element after every modifications, we can use a sorted set to record the positions of them, thus we only need to add or remove one element to the set, or do nothing in each modification. The time complexity is $O(log(k))$ for each modification, where $k$ indicates the size of the set.

Here's an implementation of it:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
public List<Integer> solve(int n, int q, int[] nums, int[][] modis) {
List<Integer> ans = new ArrayList<>();
TreeSet<Integer> positions = new TreeSet<>();
for (int i=0; i<n; i++) {
if (!evenOnes(nums[i])) {
positions.add(i);
}
}

for (int[] modi: modis) {
int pos = modi[0];
int val = modi[1];
if (evenOnes(val)) {
if (positions.contains(pos)) {
positions.remove(pos);
}
} else {
if (!positions.contains(pos)) {
positions.add(pos);
}
}

if ((positions.size() & 1) == 0) {
ans.add(n);
} else {
ans.add(Math.max(n - positions.first() - 1, positions.last()));
}
}

return ans;
}

In total, the time complexity of the algorithm is $O((k+q)log(k))$, where $k$ denotes the size of the set, and $q$ is the number of queries.

It's also mentioned that van Emde Boas trees can solve this problem, but I'll leave it to another separate article to talk about it, for this one is already long enough.

ArrayBlockingQueue in Java

Posted on 2019-05-14 | In Engineer |

My recent phone interview encountered with an OOD question about multi-thread safe queue (and I'm a new grad).
I didn't do well so I read into the source code of ArrayBlockingQueue, which is a widely used multi-thread-safe queue class in openJDK.

For full source code of openJDK, check here.

Class Members

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
public class ArrayBlockingQueue<E> extends AbstractQueue<E>
implements BlockingQueue<E>, java.io.Serializable {

/** The queued items */
final Object[] items; // Fixed size;

/** items index for next take, poll, peek or remove */
int takeIndex;

/** items index for next put, offer, or add */
int putIndex;

/** Number of elements in the queue */
int count;

/*
* Concurrency control uses the classic two-condition algorithm
* found in any textbook.
*/

/** Main lock guarding all access */
final ReentrantLock lock;
/** Condition for waiting takes */
private final Condition notEmpty;
/** Condition for waiting puts */
private final Condition notFull;
}

With these members, ArrayBlockingQueue implemented a fixed-size, multi-thread safe queue.

Main methods

For a queue, there're two main write operations: put one element into the queue, and get one element out from the queue. In multi-thread safe programs all write operations should share a Mutex. Here we're implementing our own version of three different adding method and three different polling method. They should be like

Name Behavior
add() Add an element to the queue. Return true if succeed; Throw IllegalStateException if the queue is full.
offer() Add an element to the queue. Return true if succeed; Return false if failed.
put() Add an element to the queue. Return true if succeed; Waiting if the queue is full until it's not full.
poll() Retrieves and removes the head of this queue, or returns null if this queue is empty.
take() Retrieves and removes the head of this queue, waiting if necessary until an element becomes available.

The key point here to keep the queue safe is by using a main lock and two conditions. Since all the operations are write operations, they share a Mutex.

Let's look into these methods one by one.

add()

In fact, add simply inherited it's super class's implementation.

1
2
3
4
5
6
7
public boolean add(E e) {
if (offer(e)) {
return true;
} else {
throw new IllegalStateException("Queue full");
}
}

offer()

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
/**
* Inserts the specified element at the tail of this queue if it is
* possible to do so immediately without exceeding the queue's capacity,
* returning {@code true} upon success and {@code false} if this queue
* is full. This method is generally preferable to method {@link #add},
* which can fail to insert an element only by throwing an exception.
*
* @throws NullPointerException if the specified element is null
*/
public boolean offer(E e) {
checkNotNull(e);
final ReentrantLock lock = this.lock;
lock.lock(); // Make sure only one thread can write to the queue
try {
if (count == items.length)
return false;
else {
insert(e);
return true;
}
} finally {
lock.unlock(); // Release the lock
}
}

put()

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
/**
* Inserts the specified element at the tail of this queue, waiting
* for space to become available if the queue is full.
*
* @throws InterruptedException {@inheritDoc}
* @throws NullPointerException {@inheritDoc}
*/
public void put(E e) throws InterruptedException {
checkNotNull(e);
final ReentrantLock lock = this.lock;
lock.lockInterruptibly(); // Make sure only one thread can write to the queue
try {
while (count == items.length) // If the queue is full, current thread would get hang up.
notFull.await(); // And the condition turns to await status.
insert(e);
} finally {
lock.unlock(); // Release the lock
}
}

lock() and lockInterruptibly()

As we can see from the code above, in offer(), we used lock.lock(). However in put(), we used lock.lockInterruptibly() instead. What is the difference between them?

lock.lock() would continuously trying to get the lock.
lock.lockInterruptibly() behaves almost the same as lock(), however, if it is already interrupted or is interrupted while trying to get the lock, it would throw an exception, which should be handled by it's invoker.

poll()

1
2
3
4
5
6
7
8
9
public E poll() {
final ReentrantLock lock = this.lock; // Make sure only one thread can write to the queue
lock.lock();
try {
return (count == 0) ? null : extract();
} finally {
lock.unlock(); // Release the lock
}
}

take()

1
2
3
4
5
6
7
8
9
10
11
public E take() throws InterruptedException {
final ReentrantLock lock = this.lock; // Make sure only one thread can write to the queue
lock.lockInterruptibly();
try {
while (count == 0) // If current queue is empty, current thread get hang up.
notEmpty.await(); // And the empty condition changes its status.
return extract();
} finally {
lock.unlock(); // Release the lock
}
}

insert() and extract()

These two methods basically does two thing: do the operation and notify the awaiting process.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
/**
* Inserts element at current put position, advances, and signals.
* Call only when holding lock.
*/
private void insert(E x) {
items[putIndex] = x;
putIndex = inc(putIndex); // Circularly decrement i.
++count;
notEmpty.signal(); // Notify the notEmpty condition.
}

/**
* Extracts element at current take position, advances, and signals.
* Call only when holding lock.
*/
private E extract() {
final Object[] items = this.items;
E x = this.<E>cast(items[takeIndex]);
items[takeIndex] = null;
takeIndex = inc(takeIndex); // Circularly decrement i.
--count;
notFull.signal(); // Nofity the notEmpty condition.
return x;
}

Above is the main part of the ArrayBlockingQueue in Java (openJDK). It's easy to understand, and a good review of two-condition algorithm.

东京码农求职记(准备与面试)

Posted on 2019-05-11 | In Engineer |

Before Reading

这篇文章紧跟上一篇文章,主要通过作者自身经历和理解,论述如何进行求职的前期准备和实战。同样,此文将包含极强的采样误差和主观感情色彩,希望读者理解。

Hard Skill, Soft Skill

此文的每一章节都将分为两个部分叙述,分别是Hard Skill(可以量化的部分)和Soft Skill(无法量化或很难量化的部分)。

虽然因为大多数公司都把公司文化拔高到了一个不应该的高度,甚至有把公司文化道德化的倾向,导致我非常反感宣传公司文化的行为。但是我也不得不承认,有的公司文化,如果你抛却它忽悠员工的部分,它可以是非常有用的尺子,用来指导你和他人的工作。这里我推荐Amazon的Leadership Principles。这里面有一些莫名其妙的自大狂玩意儿(比如Are Right, A Lot),但是其他可以主观实施和控制的部分,是许多顶级公司对员工共同的要求,可以用来指导自己准备求职过程的方方面面。

举个小例子,许多人提到了日本总体环境佛系,想准备什么东西都没有人做。其他人都不做,你觉得这是对的,你去做了,这就是"Have backbones","Bias for Action"和"Think Big"。如果你能够影响、组织起其他人跟你一起做,这就是"Earn Trust", "Ownership"和"Hire and Develop the Best"。听起来很傻,但是这是一个优秀的码农应有的素质,也是你在简历中可以,甚至说是应该体现的素质,在面试的时候,这也是非常好的话题。

Information Hunter

Drowning in information, starving for knowledge.

就职的第一步,是获得信息,包括招聘信息,宣讲会,招聘时间表,过去问,具体公司的招聘流程,招聘时间线,招聘流程。虽然大多数日企都服从经团联的安排,一般从毕业前一年的3月开始招聘,但是码农的Dream Company普遍是不管的。这也导致码农就活的战线会拉的非常的长,以我个人的经验而言,我从投出第一份简历到拿到目前最好的offer (Amazon SDE),中间间隔了8个月,而Indeed 和Google都还在流程中。在如此长期的战线中,如何尽可能及时准确地获得信息就成了一大难题。

Hard

招聘信息最直接、及时、准确的来源肯定是各大公司官网。提前锁定自己心仪公司的网站,设置Job Alert,并且定期查看是最万无一失的方法(点名批评Microsoft, college 招聘页面曾经完全down掉了两个月之久)。

除了各大公司的招聘官网,一下这些网站都有能够较为即使准确地获得招聘信息(包括招聘会和招聘投递信息):
アカリク;
リクナビ(注意这个网站是分年度的,要选择自己对应的毕业年度);
外資就活;
CareerCross;
LinkedIn

Soft

建立你的信息网络。找到那些和你想法类似的人,主动和他们建立联系和共享信息。(Tip:收获信息的最好方式是主动提供信息)

如何找到这些人?上同一门课认识的人,留学生群,参加招聘会结识的人。只要你主动地说Hi,主动地分享你有的信息,那么信息网络的建成只不过是水到渠成了。

Resumes

了解到了招聘岗位之后,我们就要开始投递了。投递的第一步就是准备自己的简历。不要等到临投前一天才开始写简历,今天就开始写。写简历是一个不停地改进的过程,也是审视自己的不足的过程。日本留学一个不太好的部分就是,因为实习机会太少,而周围很多人都是每天吃喝玩乐,很多学生凑不出一整页的简历。如果你也是其中一员,那么抓紧时间,多做一些项目,做一些实习,打一些比赛,发一些论文,无论如何也要凑出一整页简历来。

虽然许多日企是给一个entry sheet让你填,但是填的内容和你的简历内容大同小异。而且top tier 的公司基本都是要直接收CV。

建议去一亩三分地看看如何准备和写简历。

Hard

哪些东西可以写进简历:教育经历,实习项目,开源项目,自己的side project,比赛项目,等等。

简历最好用中英日文各写一份,这样在复制粘贴到entry sheet的时候不用重复写(很多公司的ES在提交之后就不能看到了,我一开始不知道,至少写了3遍实习项目的日文版)。

虽说每份简历在投递前最好是有针对性的修改,但是对于大部分根本写不满一页的学生来说,有时间想简历怎么编,不如赶紧再去做个项目。对于能够写出一页半简历的同学来说,可以在每次投递前针对自己投递的岗位删掉一些不太契合这一岗位的项目的细节,始终保持简历上有所有的项目,重点项目重点说,而且简历长度刚好一页。

推荐通过LinkedIn初步编写自己的简历,还可以在LinkedIn上查看、学习其他人是如何编写自己的项目经历的。在出最终简历的时候,推荐使用超级简历,跟着它的引导调整自己简历的内容和排版。

Soft

同一个项目用不同的说法写出来,效果是完全不一样的。要针对性的表达你在项目中的作用。同时,为了能够在项目中有东西可说,应当在做项目之前和做项目中就想好它会在我的简历中怎样展现出来,一边意识到简历一边推进项目有奇效。

同时我发现有的转行的同学,误认为项目就是在GitHub 上找个开源项目实现一下,或者就是上了某个网课,做了课设作业。不是说这些就不行,而是这些项目缺乏区分度,不能体现你作为candidate拥有的竞争力。

相对于课设,实习经历,比赛经历和对开源社区的贡献就明显有更强的竞争力,这些经历可以说是实际工作的预演,是采用了某种技术解决了某个实际问题的过程。好消息是日本有这些经历的人真的不多,坏消息是还是比好的岗位数量多得多的。

Online Assessment

Hard

普遍是编码测试,建议在C++, Java和 Python中至少弄懂一门语言。如果你只会Python,那建议还是要学一学C++或者Java,因为Amazon东京 OA 只许用C++或者Java。

Soft

没有,即使有,也很弱智。我还没见过招码农还要求做SPI的公司。

Interviews

面试是整个招聘流程中最大的大头,而且top tier的面试强度是非常高的,Amazon一上午面4轮,Google听说也是一天面4轮。虽然一般认为面试并不能在如此短的时间里体现一个人的水平和性格,但是面试总归是比简历靠谱的。准备美资互联网公司的方法也十分简单和直接,相当多早就有志于此的朋友可能对于他们考察方法早已烂熟于胸,但是也要考虑到很多日本的留学生根本没有关注过这方面的情况,我还是决定写一写具体的面试内容和准备方法。

推荐参考书目:《剑指offer》和《Cracking the Coding Interview》。前者是中文编写,篇幅更短小,适合完全没有外企面试经验的同学熟悉面试流程和面试重点。后者是英文编写,更详细,篇幅更长,适合有一定准备的同学阅读,熟悉一些术语的英语说法和更多面试中的细节。

推荐论坛:一亩三分地。不仅有最新最全面经,还有装逼,打脸,互怼,感情生活,北美猥琐男等愉悦身心的顶级资讯,实乃居家旅行,刷题拿offer必备网站。

建议找一两个朋友定期互相做mock interview,可以习惯一边说话一边写代码,同时习惯在白板上写代码,还可以互相给feedback来提升面试过程的表现。

Hard Skill

Top tier公司都是要白板编程的,有的公司除了白板编程之外还有一些Behavior Question(BQ).白板编程主要考察算法和数据结构,new grad偶尔会问一些简单的 OOD(Object Oriented Design)问题。

算法和数据结构

究竟是我刷了题,还是题刷了我?

刷题进公司一直是一个充满争议的话题,但是不得不承认,想要进top tier公司,对于大部分人来说,刷题是最有效的途径。这里给出我的刷题记录以供参考(我本人没有任何算法竞赛的经验)。

刷题提交记录

刷题提交记录

总题数

总题数

周赛全球排名

周赛全球排名

Amazon面试前一周的周赛排名

Amazon面试前一周的周赛排名

对于没有竞赛基础的同学来说,在刚开始刷题的时候会感受到痛苦,这是十分正常的,事实上,大部分人在做到150题左右,就会开始量变到质变。

在刷了一定数量的题之后,如果想要知道自己大概是个什么水平,可以参加 Leetcode 每周的周赛,日本时间周日早上11:30至13:00。周赛是个很好的检验自己水平的地方,同时也是个获得自信的好地方,因为大多数参赛选手实在是菜的可怕。

注意:并不是说刷到这个数才能进这些公司,实际上我是刷了大量的无用功的,因为我刷到一半有点刷上瘾了,觉得做这些算法题是一件很快乐的事,没事干做两道,还因此荒废了研究。。。现在我觉得 leetcode 没太大意思,又跑去做 Hackerrank 玩,纯粹是出于个人兴趣才做了这么多。

OOD

《Cracking the Coding Interview》和一亩三分地里有很多OOD的例子,但是实话实说,遇到做过的才会,遇到没做过的还是挺蒙的。。。好在OOD并没有一个标准答案,你只要和面试官探讨出一个make sense的方案就行了。

Behavior Questions

Amazon一定会有BQ,面试之前照着Leadership Principles准备一些小故事。Google有可能有BQ,但是概率实在是很低。

Soft Skill

不同公司会对面试官进行不同的训练,要求他们重视候选人的某些方面。但是总的来说,大多数面试官评价候选者的一个很重要标准是,他是否愿意和候选者作为同事共事。因此面试很大程度上,是模拟你和面试官在工作中共同解决一个问题,想象他是你的一个同事,抱着和同事交流的心态和态度去沟通,比较容易做到不卑不亢。

自信

我自己有幸有过当面试官的经验,后来和许多其他的工程师聊的时候,发现大多数工程师都或多或少当过面试官。所以面试官其实很有可能就是一个普通的码农,面试的时候完全没有害怕的必要。

其次,不自信在面试的时候绝对是红灯信号,背后的理由也很简单:如果你自己都不相信自己可以胜任这份工作,那么你如何说服你的面试官相信你?

当然也要注意分清自信和自负的区别。你可以是一个geek,对自己的技术充满自信,但你不能是一个jerk,目空一切,一碰就炸,没人想要和jerk共事。如果你可以笑着和面试官讲一个你以前的经历中犯过的错,就像谈论自己小时候的一段逸闻一样,那么你已经展露出了充分的自信。当然,你也要记得讲自己是如何处理这个错的,获得了怎样的效果,毕竟这还是在面试中。

交流能力

许多面试题都是模糊的,不明确的,并不是因为这个题有问题,而是面试官有意考察你在工作中的沟通能力。在听到一个问题之后,尽可能的问出一切你认为可能的问题,包括但不限于:输入是否遵循某种pattern(例如一个数组是否是排好序的数组),如何处理非法的输入(包括空的输入和会overflow的输入),如果要实现多个互相影响的功能,那么他们被调用的次数分别是怎样的(这会严重影响你对于整个类的设计)。

同时,在整个思考过程中,尽量做到think aloud,把你的任何一点想法都说出来和面试官讨论,如果面试官没有提出异议,那说明你的思路是没有大的问题。同样,如果面试官明确表达了不同意,不要在自己的思路上纠结。谁对谁错不一定,但是面试官会认为你不能听取他人意见,沟通能力差。

有些朋友表达了对自己语言能力的担忧,我认为其实对于大部分人来说,对英语水平的担忧完全是杞人忧天。语言只是交流的工具,只要达到了可以沟通的水平就可以了。Google Tokyo Search Team在 tech talk上开玩笑地说"The official language here is non-native English"。如果在沟通过程中有没听明白的部分,完全可以"I beg your pardon"或者自己复述一遍面试官的话,询问自己是否理解正确了。

与公司文化的契合程度

许多公司会挑选契合公司文化的雇员,有的公司明确的给出了公司文化的要求,而有的公司隐藏的很深。面试的时候,面试者可以大概感受到公司的氛围。

我个人的建议是,不要在面试的时候过度的伪装自己,因为求职是一个双向选择的过程,如果你去到了一个你无法适应的工作环境中,你的接下来几年都会相当的痛苦:你清醒的时间里大约有一半都在工作上,而你却不喜欢它。

但是也有一些技巧可以帮助你扬长避短,体现你符合公司文化,或者说展现你的优势。例如,大多数情况下,合作项目中总会有那么几个摸鱼侠。如果你是个好好先生,容忍了摸鱼侠的存在,那么你就可以着重表达自己“Earn Trust”的部分,你的组员都相信你,或者说,至少你们没有撕破脸。如果你无法忍受摸鱼侠,和他们大吵一架并且把他们赶走了,你可以着重表达自己“Hire and Develop the Best”和“Insist on the Highest Standard”。当然你要注意自己的表达,不要让自己听起来像个自负的jerk。

最后

找工作是一个小马过河的问题,最后的最后,我依旧想要强调,这两篇文章仅仅代表我个人的经验和看法,完全不保证客观性,准确性和及时性。这两篇文章仅作抛砖引玉,希望能给各位学弟学妹提供一点点微小的帮助,同时也希望其他看到这篇博客,有自己想法的朋友,也能够通过各种渠道发声。After all, the community is you and me and everyone.

东京码农求职记

Posted on 2019-05-06 | In Engineer |

Before Reading

本文仅通过作者经历和个人看法对于东京计算机/互联网学生应届求职给出一些经验性描述和包含个人感情色彩的建议和评论,希望读者结合自身和环境的实际情况进行批判性思考。在此抛砖引玉,欢迎读者在下方评论添加更多地信息,或发布自身的经验。After all, community is you and me and everyone.

Why Write This

要恰饭的嘛! —— yyf

对于大部分在东京读计算机硕士的小码农来说,找工作几乎是每个人毕业前最头疼的事。东京又是个非常奇特的城市,同样是东大码农,收入低起年收400万,高至超越1200万,一切皆有可能。但是东京的许多大学,据我所知,学校就业处的信息可以说是一无是处,无法对小码农就职提供足够的帮助(主要还是因为学校一般把情报理工算在工学部里,和工学部一起提供信息,而忽视了计算机行业近几年翻天覆地的变化)。

同时,东京的码农职位又处于一个十分奇妙的环境下,年功序列的百年老店,咨询服务,金融技术,硅谷互联网,同时存在于东京码农的招聘市场,而且分别有一套互不兼容的招聘流程。在这篇博客中,读者很容易就会发现,我自己也并没有(也实在做不到)参与所有类型的公司招聘,仅仅能对博主自身经历过的一些求职活动进行详细的论述(并包含强烈的个人感情色彩,希望读者可以自行甄别),对于博主没有亲身经历的部分,只能通过资料和道听途说(主要来源于和其他认识的求职学生谈天说地)进行论述,对于观点的客观性,资料的准确性和时效性都不作保证。

不过由于 1.大数定律,足够多的样本中的噪声服从正态分布。 2.大多数的超线性回归模型都可以对包含正态分布噪声的样本进行无偏预测。 因此写这篇博客并非毫无意义,希望读者可以多多寻找其他资料,理性分析理性看待,数据越多,预测越准。

这篇博文将结合我自身经历,概述日本东京应届毕业生在计算机领域的求职情况,以及对如何做准备做一些微小的建议。

同时希望读者可以理解,找工作本身是一个主观性极强的小马过河式的问题,有的技巧对我来说和吃饭一样自然,但是对一些朋友来说可能是需要艰苦训练才能获得。同时对于有的朋友来说易如反掌的事情,我本人怎么都做不好。这是十分正常且普遍存在的现象,希望读者能够充分发挥主观能动性,对自己进行切实的分析。

About Me

正如前文所言,找工作本身是一个小马过河的问题,我只能尽可能地量化我自己能量化的指标,以期读者可以尽可能获得一个客观的自我评估体系。但是出于隐私考虑,依旧有一些信息我无法公布,还请谅解。

我的教育经历为:北京航空航天大学本科软件工程专业,东京大学大学院情报理工学系研究科Computer Science(CS)专攻修士(硕士)。

我有两段国内创业公司的实习经历,总时长2年。来东京之后打了一段时间的Kaggle,拿了两块铜牌,达到了Kaggle Expert。目前没有论文发表。

我的语言能力是:中文(Native Speaker(废话)),英语(托福ibt 103),日语(N1 合格),俄语(看过书听过歌,别问,问就是до свидания)。

在这一就职季中,我侥幸获得了Amazon.co.jp的 SDE offer,薪资水平高的一度让我乐得合不拢嘴。从前文可以看出,我本人算不上大神,也算不得弱鸡。这应该就是一个普通人稍作准备就能获得的offer水平。

About Some Rumors

几经思考,我决定把这一段写在最前面,因为我发现在日本计算机求职这件事情上,中文互联网的消息几乎都包含几个十分错误,甚至会对不了解的人有毁灭性误导的谣言。因此我想先把这几个很常见的谣言拿出来单独说一下,希望大家仔细辨别,好好斟酌。

文科生也能写代码

从逻辑的角度上来讲,这句话毫无问题。但是在中文互联网常见的语境下,这一句话指日本码农求职市场普遍不要求计算机专业知识,公司包培训包工作。这一点即使不能算完全错误,也与事实,或者说人们的期望相去甚远。事实上,不要求专业知识的日本企业大多数本质上和国内的外包公司没有区别,只是把自己报培训班的工夫挪到了公司里而已。但是其背后的高强度,低价值劳动的本质,通常恶劣而没有尊严的工作环境,看不到出路的职业生涯发展是如出一辙的,希望各位同学谨慎谨慎再谨慎,牢记天上没有免费的馅饼这一基本事实。

日本写代码工资就那样,和所有行业都差不多

日本应届生薪资是有名的平均主义,似乎只有在商、法、医、税这几个传统贵族行业才能免俗。然而近年来,随着日本经济和社会的改革,尤其是就职市场的改革,再加上美国互联网公司的疯狂冲击(中国带来的冲击也有一些,但不明显),码农的薪资水平开始逐年甩开其他行业,逐渐向贵族行业靠拢,因此一些努力,一些聪明的选择,还有成吨的运气,码农是能够达到让其他人瞠目结舌的应届生收入的,切不可自暴自弃。

Les Trois Mousquetaires (《三个火枪手》)

码农们的就职去向可以说很宽,也可以说很窄,取决于个人对于职业的挑剔程度。如果完全不挑剔的话,总计有四个可能的公司去向:传统日企,日本新兴互联网,外资互联网,咨询、金融的技术部。

传统日企

传统日企的IT就职和其他专业的就职没有任何不同,典型的日式平均主义。

Pros:不太可能被裁员。

Cons:其他所有。

日本新兴互联网

其实这一类才是在日本做码农的主要去向,从小一点的各类中型企业(Fixstars, mixi),到日本的大型互联网企业(LINE,DeNA,Yahoo! Japan),还有明星独角兽(Preferred Networks, Mercari),应有尽有。待遇水平也是从低到高一应俱全。

Pros: 收入普遍高于传统日企。
有一定的技术水平和技术积累。
相对健全的职业体系。

Cons: 主要面试、工作语言是日语。有的公司有英语招聘项目,但是不会日语总的来说会是减分项。
虽然在日本算是大型公司,但是和全球巨头比起来各方面差距相当大。差距从薪酬到职业生涯发展存在于几乎所有方面内。

外资互联网

在日本招聘的外资互联网少得可怜,例如Google最大的对手Facebook在日本就不招聘技术岗位。在日本进行校招的外资公司有:Google, Microsoft, Amazon, Indeed. Apple在日本主要招硬件,不怎么招软件。

Pros: 在日本打工族里算最高一级的收入水平。全球领先的技术。完善、健全、全球一流的人才培养方案和职业体系。纯英语的招聘流程和英语为主的工作环境。Transfer去其他国家、地区的机会。

Cons: 和湾区或者西雅图总部的人比起来,你的钱是比较少的,容易吃柠檬。日本分部组、话语权都有限,这些都有能制约你的职业发展。每年招聘人数很少,因此竞争很激烈(也很玄学)。

咨询、金融

包括埃森哲,高盛在内的一些咨询、金融企业。这里列出来仅仅是为了说明有这些公司在招聘,但是具体如何我完全不知道,因为我一不做咨询(不能写代码),二不去金融(就是不喜欢)。读者要想了解这方面的信息只能自己想想法。。。

What to Expect

可能是由于中文互联网上对于日本IT业界的信息普遍滞后,许多朋友对于在日本做IT的正常收入、福利水平不甚了解。正所谓知己知彼,百战不殆,博主在这里粗浅的列出一些大概的数据,同时推荐一个日本网站,叫Vokers,类似于日本的glassdoor,是雇员对于雇主的评价网站,可以在找工作和评价收入的时候进行参考。

年收

柠檬树上柠檬果,柠檬树下你和我。

以下所列薪资均为应届毕业生税前收入。

普通日企:300~400w

互联网日企:

公司名 年收
Yahoo! Japan 400~500w
Line 550~600w
DeNA 500w
CyberAgent 500w
Preferred Networks 800~1200w
Mercari 700 ~ 900w
Mixi 400w

以上数据来源模糊(样本数量太少),是我个人的印象数据。具体情况还请谨慎参考。

外资巨头:

公司名 年收
Google 900w ~ 1500w
Microsoft 700w ~ 900w
Amazon 800w ~ 900w
Indeed 1000w ~ 1200w

Work-Life Balance

总体来说,外资企业比日企要良心许多,得益于其全球统一的福利政策和科学高效的管理模式,很少能找到比外资巨头更有work-life balance的公司(除了Amazon,Oncall减分很多)。

日系企业各有各的不同。基本遵循给钱多的事儿也多的模式,但是还是不敢像国内996那样折腾,各个公司的加班时间依旧可以在Vokers上查到。需要指出的是,月均加班在40小时以上的公司已经相当高的加班度了,但是也很少有在20小时以内的公司,所以工作强度差别也并不是很大。

对于一些不太了解码农就职行情的同学,这里需要指出一个常见误区:许多人会说,薪酬不重要,重要的是学到东西和发展空间。这里把薪酬和发展对立起来,就是和现实不太符合的说法。在计算机行业,普遍而言,你最能学到东西的地方,就是那些给你钱给的最多的地方。所谓的在收入和发展中斟酌,也是指FLAANG(+Microsoft)和独角兽创业公司/即将上市的新兴巨头(Uber, Airbnb, Pinterest)中斟酌,而上述所有公司,给钱都很大方,区别只在于股票能不能发财。然而在日本,几乎没有后者可选,所以能去Microsoft, Apple, Google, Amazon, Indeed,实在是想不出不去的理由。

最后

这一篇主要对于日本IT就职进行了客观和主观的综述,希望能够对于大家了解日本IT就职有所帮助。在下一篇博客中,我会详细的叙述我个人对于如何具体准备日企就职的一些建议的和看法。但是需要提出的是,因为我个人完全偏向于美国互联网公司的就职,因此对于其他公司的就职准备可能(几乎肯定)有不了解的地方,希望读者能独立思考,积极沟通。

Kick Start2019 Round A Question A. Training

Posted on 2019-03-31 | In Computer Science |

Problem

As the football coach at your local school, you have been tasked with picking a team of exactly P students to represent your school. There are N students for you to pick from. The i-th student has a skill rating Si, which is a positive integer indicating how skilled they are.

You have decided that a team is fair if it has exactly P students on it and they all have the same skill rating. That way, everyone plays as a team. Initially, it might not be possible to pick a fair team, so you will give some of the students one-on-one coaching. It takes one hour of coaching to increase the skill rating of any student by 1.

The competition season is starting very soon (in fact, the first match has already started!), so you'd like to find the minimum number of hours of coaching you need to give before you are able to pick a fair team.

Input

The first line of the input gives the number of test cases, T. T test cases follow. Each test case starts with a line containing the two integers N and P, the number of students and the number of students you need to pick, respectively. Then, another line follows containing N integers Si; the i-th of these is the skill of the i-th student.

Output

For each test case, output one line containing Case #x: y , where x is the test case number (starting from 1) and y is the minimum number of hours of coaching needed, before you can pick a fair team of P students.

Sample

Input

1
2
3
4
5
6
7
3
4 3
3 1 9 100
6 2
5 5 1 2 3 4
5 5
7 7 1 7 7

Output

1
2
3
Case #1: 14
Case #2: 0
Case #3: 6

Solution

This is another question that requires a certain property in a fixed amount of elements among a long array. For this kind of problem, it's pretty natural to think that a sliding window would be a great solution, for most time, a sliding window algorithm only takes O(n) time.

Back to this problem, it's easy to think that if we put all the players with similar ability together would result in a smaller training time, and to do so, we want the array to be sorted.

The algorithm is split into two parts: Sort the array and track the training time required.

Say we have a window starting at index , and we know the training time for it is . We also know the the last element of this window is , and the first element is . What's the training time for the next window? The answer is . Notice that is the new highest number.

Do also be aware that the number may overflows Integer, use Long instead. The code is as below:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
class SolutionA {
long train(long[] nums, int n, int k) {
Arrays.sort(nums);
long window = 0l;
int i = 0;
long high = nums[k-1];
for (i=0; i<k; i++) {
window += high - nums[i];
}
long ans = window;
long prevHigh = high;
for(; i<n; i++) {
high = nums[i];
window -= prevHigh - nums[i-k];
window += (high - prevHigh) * (k-1);
ans = Math.min(ans, window);
prevHigh = high;
}
return ans;
}
}

Data Model, Special Methods

Posted on 2018-12-06 | In Computer Science |

In the last blog, we had a slight feeling about the magical things that special methods can do. This time, we'll talk deeper into how to use those special methods.

1.2 How to Use Special Methods

Specially, we have to be clear that, special methods are designed to be called by Python interpreter, and not by YOU.

In most cases, we don't write something like my_object.__len__(). Just like we mentioned, everything in Python is trying to stay in consistency, so when we write len(my_object), and the my_object is a userdefined class, Python will call the **\_len__()** instance method you implemented. Again, such consistency is the key point that Python can get rid of a huge mess.

Even more, for some built-in types like list, str, bytearray, and so on, the interpreter may take some short-cut when a special function is called. For example, the CPython implementation of len() actually returns the value of ob_size field in the PyVarObject C struct, that represents the variable-size built-in object in memory. Of course this is much faster than calling a method.

More often than not, the calling of special methods is implicit. Like for i in x:, indeed it cause the invocation of iter(x), which in turn may call __iter__() if it's available. (This also indicates that, if we implemented our own __iter__() in our user-defined class, we may use for i in my_object:), see an example!

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
>>> class MyClass:
... def __init__(self, n):
... self.a = []
... for i in range(0, n):
... self.a.append(i)
...
... def __iter__(self):
... curr = 0
... while curr < len(self.a):
... yield self.a[curr]
... curr += 1
...
>>> my_object = MyClass(3)
>>>
>>> for ele in my_object:
... print(ele)
...
0
1
2

Simply by implementing a __iter__(), we made our user-defined class supporting the most frequently used built-in syntax.

Normally, your code should not call those special methods directly too often. Unless you're doing metaprogramming, you should be implementing those special methods much more often than invoking them explicitly. Perhaps the only special methods that you may call frequently is the __init__(), to invoke the initializer of a superclass in your own __init__() implementation.

If you want to use a special method, it is recommended that call them by calling the built-in functions (e.g, len, iter, str, etc). There functions does the job you want them to do, normally with some extra benefits. And for some built-in class, they can be faster (like the len() we introduced just now).

Besides, please avoid adding custom special methods arbitrarily. The name like __foo()__ may haven't been occupied now, but who knows about the future?

1.2.1 Emulating Numeric Types

Several special methods allow user object to respond to operators such as +. We'll go through a simple example to see how special methods works.

We'll implement a class to represent two-dimensional vectors (and more dimensions in the future). The built-in complex type can be used to represent two dimensional vectors, but when talking about more dimensions, we'll need our own class, and we want them to support built-in operators. Here're how we do it.

We will start by designing the API for such a class by writing a simulated console session that we use later as a doctest. Here's what we want.

1
2
3
4
>>> a = Vector(1, 2)
>>> b = Vector(2, 1)
>>> a + b
Vector(3, 3)

The abs built-in function returns the absolute value of integers and floats, and magnitude of complex numbers. So to be consistent, our API should use abs to calculate the magnitude of a vector:

1
2
3
>>> v = Vector(3, 4)
>>> abs(v)
5.0

And also the * operator for scalar multiplication (Notice, this is not a dot multiplication for two vectors)

1
2
>>> v * 3
Vector(9, 12)

Also notice that, the printed information is under user definition. It's pretty simple to make it, just implement a special method called __repr__().

So, all in all, in order to make all the functions above work, the special methods we need to implement are: __repr__(), __abs__(), __add__(), and __mul__()

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
class Vector:

def __init__(self, x=0, y=0):
self.x = x
self.y = y

def __repr__(self):
return "Vector(%r, %r)" % (self.x, self.y)

def __abs__(self):
return hypot(self.x, self.y)

def __bool__(self):
return bool(abs(self))

def __add__(self, other):
x = self.x + other.x
y = self.y + other.y
return Vector(x, y)

def __mul__(self, scalar):
return Vector(self.x * scalar, self.y * scalar)

Although there 6 special methods in the code, but none of them will be invoked by in it's own code( except for __init__()). Even some other code may want to use these methods, they don't explicitly call them, just like what we can see in the console listings. Then let's discuss the code for each special method.

1.2.2 String representation.

The __repr__() is called by the repr built-in to get the string presentation of the instance for inspection. If we didn't implement it, the string returned would look like something like this:

1
2
>>> Vector(1,1)
<Vector object at 0x10e380400>

The string returned by __repr__() should be unambiguous, and, if possible, match the source code necessary to re-create the object being represented. This is why the our __repr__() would return a string looks calling the constructor of the class.

There's a similar special method called __str__(), which is called by the str() constructor and implicitly used by the print function. Basically, __str__() should return a string suitable for display to end users.

When you only implement one of the both, implement __repr__(), because Python will call __repr__() when __str__() is not available as a fallback.

1.2.3 Arithmetic Operators

We've already seen how to implement our own + and by implementing __add__ and *__mul__. Note that, in both cases, the original object is not changed, and the method returned a new instance of the class. The principal of infix operators is to create new objects and leave the parameters untouched.

1.2.4 Boolean Value of a Custom Type

Although we have boolean type in Python, almost any object can be applied to a boolean context, such as a expression controlled by if or while. To do so, python will call bool(x), and it only returns either True or False.

By default, an instance of our self-defined class is always True, unless we implemented the __bool__ or __len__. bool(x) basically calls x.__bool__() and returns the result. If it's not implemented, Python would try to invoke x.__len__() instead. If it returns 0, then False; Otherwise, True.

More Special Methods.

The special methods are list in Python official site. Check them!

Data Model in Python

Posted on 2018-10-28 | In Computer Science |

One of the best qualities of Python is its consistency.

Luciano Ramalho, author of Fluent Python: Clear, Concise, and Effective Programming

By the first time I read this sentence, all I felt was wired. As we all know, Python is a dynamically-typed language, you can never succeed in guessing the class or type of a variable. So at least, in terms of type, Python is far from consistent. However, soon I understand the meaning of consistency in Python: Nothing is special. Here, we're going to see several examples and understand what "consistency" really means in Python.

Note: In Python, we're going to meet a lot of methods that looks like __getitem__, and we'll call them "dunder_getitem" (dunder means double-under). And also notice that, double under like __x has other meanings in Python, and please don't mess up with them.

It's very important to implement those methods that can be called by the framework itself, we're going to show you how to implement two special methods called __getitem__ and __len__.

1.1 Pythonic Cards

Code 1-1: Cards with order.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
import collections

card = collections.namedtuple('Card', ['rank', 'suit'])

class FrenchDeck:
ranks = [str(n) for n in range(2, 11)] + list('JQKA')
suits = 'spades diamonds clubs hearts'.split()

def __init__(self):
self._cards = [Card(rank, suit) for suit in self.suits
for rank in self.ranks]

def __len__(self):
return len(self._cards)

def __getitem__(self, position):
return self._cards[position]

Now, let's look at the FrenchDeck class. It's short and powerful. First of all, like any Python standard collection class, we can use len() function to check how many cards in it:

1
2
3
>>> deck = FrenchDeck()
>>> len(deck)
52

We can also pick any specific card in the cards using [ ] like the following code:

1
2
3
4
>>> deck[0]
Card(rank='2', suit='spades')
>>> deck[-1]
Card(rank='A', suit='hearts')

The more interesting thing is that, we don't need to write a new function to randomly pick a card, Python already had a function called random.choice to do the job, and we can just use it on the instance of our cards.

1
2
3
>>> from random import choice
>>> choice(deck)
Card(rank='4', suit='spades')

See! We already benefit from implementing special methods to use Python data model:

  • The users of your class doesn't need to remember many different names of a standard manipulation (sizeof() or length() or something else).
  • Just use Python standard library, no need for re-inventing the wheels.

Even more, since __getitem__ hands [ ] manipulation to self._cards list, so our deck class automatically supports slicing. See below:

1
2
3
4
5
6
7
8
9
>>> deck[: 3]
[Card(rank='2', suit='spades'),
Card(rank='3', suit='spades'),
Card(rank='4', suit='spades')]
>>> deck[12:: 13]
[Card(rank='A', suit='spades'),
Card(rank='A', suit='diamonds'),
Card(rank='A', suit='clubs'),
Card(rank='A', suit='hearts')]

The second function means that, pick all cards that has index 12, and pick one in every 13 cards.

And even more! Just by implementing __getitem__ method, the cards become iterable:

1
2
3
4
5
6
7
>>> for card in deck: # doctest: +ELLIPSIS
... print(card)
Card(rank='2', suit='spades')
Card(rank='3', suit='spades')
Card(rank='4', suit='spades')
Card(rank='5', suit='spades')
...v

Just by implementing __getitem__ and _\len__ methods, FrenchDeck class is just like any sequential data model in Python, showing core features of Python (like iterating and slicing). It can also be used for standard libraries like random.choice, reversed and sorted.

Here, we see how consistency exists in Python, and definitely we can write more elegant and pythonic code by implementing these special methods.

However, our cards still cannot be shuffled so far. We'll not talk about it now, although it's easy enough to be implemented by a single line of code in __setitem__, it influences more important things and we'll talk about them together.

Python, Deeper Understanding

Posted on 2018-10-28 | In Computer Science |

Here's the plan: when someone uses a feature you don't understand, simply shoot them. This is easier than learning something new, and before too long the only living coders will be writing in an easily understood, tiny subset of Python 0.9.6 (wink).

Tim Peters, Legendary Core developer and author of The Zen of Python


Sadly, I've been one of those "shooters" for a long time. I've been writing Python code for around four years on many different tasks. From back-end server that provide service to many clients, to playing Kaggle competitions, Python has always been one of the most frequently used languages and I do really like it in most cases (however, there're also some cases that I really messed up with it). Looking back in those code I've written, I start to think, are they elegant enough? The answer is definitely NO. The python provided multiple interesting and power features that is different from C++ and JAVA, but I never used them.

Since I would work with Python for a long time in the coming future, I'd better learn more about Python and try to write more pythonic code. Besides, the Zen of Python is influencing the design of many other languages, understanding more about the successful language would benefit those who like and enjoy programming.

This series would be a memo and conclusion of Python, basically Python 3.6, mainly talking about more advanced Python features, so before reading it, you are thought to have experience of programming Python, be able to write, read and understand basic Python code. This series is for those who want to go further and write more elegant (pythonic?) code. Discussion and comments are welcomed.

12

Zhao Fengbo

16 posts
5 categories
18 tags
GitHub E-Mail
© 2020 Zhao Fengbo
Powered by Hexo
|
Theme — NexT.Mist v6.0.6