#InterviewNote

Union Find

From wiki:
In computer science, a disjoint-set data structure, also called a union–find data structure or merge–find set, is a data structure that keeps track of a set of elements partitioned into a number of disjoint (nonoverlapping) subsets. It supports two useful operations:
Find: Determine which subset a particular element is in. Find typically returns an item from this set that serves as its “representative”; by comparing the result of two Find operations, one can determine whether two elements are in the same subset.
Union: Join two subsets into a single subset.

Example:
261. Graph Valid Tree
Given n nodes labeled from 0 to n - 1 and a list of undirected edges (each edge is a pair of nodes), write a function to check whether these edges make up a valid tree.

For example:
Given n = 5 and edges = [[0, 1], [0, 2], [0, 3], [1, 4]], return true.
Given n = 5 and edges = [[0, 1], [1, 2], [2, 3], [1, 3], [1, 4]], return false.

Hint:
Given n = 5 and edges = [[0, 1], [1, 2], [3, 4]], what should your return? Is this case a valid tree?
According to the definition of tree on Wikipedia: “a tree is an undirected graph in which any two vertices are connected by exactly one path. In other words, any connected graph without simple cycles is a tree.”

Note: you can assume that no duplicate edges will appear in edges. Since all edges are undirected, [0, 1] is the same as [1, 0] and thus will not appear together in edges.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
public class Solution {
public boolean validTree(int n, int[][] edges) {
UnionFind uf = new UnionFind(n);
for (int i = 0; i < edges.length; i++) {
if (!uf.union(edges[i][0], edges[i][1])) {
return false;
}
}
return uf.size == 1;
}

private class UnionFind {
int size;
int[] nodes;

UnionFind(int size) {
this.size = size;
this.nodes = new int[size];
for (int i = 0; i < size; i++) {
nodes[i] = i;
}
}

boolean union(int a, int b) {
int label_a = nodes[a];
int label_b = nodes[b];
if (label_a == label_b) {
return false;
} else {
for (int i = 0; i < nodes.length; i++) {
if (nodes[i] == label_a) {
nodes[i] = label_b;
}
}
size--;
return true;
}
}
}
}

Read More

Introduction

ASCII

Single byte encoding only using the bottom 7 bits(0-127). The top bit is always 0.

In English, 128 symbols are enough to represent all character. But in other situations, French for example, they are insufficient. So we use the top bit to represent accents so that there are up to 256 characters. ‘é’ encodes as 1000 0010(130).
But here comes another problem. In different languages, the same binary encoding represents different characters, such as 130 in French is é, but in Hebrew, it is Gimel(ג). Not to mention Chinese characters(more than 100 thousand). So we introduce another encoding system, unicode.

Unicode

“Unicode encoding” is more properly known as UTF-16: 2 bytes per “code point”. This is the native format of strings in .NET. Values outside the Basic Multilingual Plane(BMP) are encoded as surrogate pairs. These are relatively rarely used - which is a good job, as very few developers get them right, I suspect. “Unicode” is really the character set - it is unfortunate that the term is also used as a synonym for UTF-16 in .NET and various Windows applications.
Unicode can be implemented by different character encodings. The most commonly used encodings are UTF-8, UTF-7 and UTF-32:
UTF-8: Variable length encoding, 1-4 bytes covers every current character. ASCII values are encoded as ASCII.
UTF-7: Usually used for mail encoding. Chances are if you think you need it and you’re not doing mail, you’re wrong. (not widely used at all.)
UTF-32: Fixed width encoding using 4 bytes per code point. This isn’t very efficient, but makes life easier outside the BMP.

UTF-8

UTF-8 has become the dominant character encoding for the World Wide Web. The rule of UTF-8 is:
(1) If this is in ASCII, UTF-8 is the same with ASCII
(2) For n UTF bytes(n > 1), the first n bits in the first byte set as 1, the n + 1 bit sets as 0, all the first two bits in the following bytes are all 10. The rest of the bits are represented as the unicode of the character.

UTF Bytes Hexadecimal Binary
1 0000 0000 to 0000 007F 0xxxxxxx
2 0000 0080 to 0000 07FF 110xxxxx 10xxxxxx
3 0000 0800 to 0000 FFFF 1110xxxx 10xxxxxx 10xxxxxx
4 0001 0000 to 0010 FFFF 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx

For example, the unicode of “严” is 4E25 (100111000100101). According to the table above, this character belongs to the third row. So its UTF-8 is E4B8A5 (11100100 10111000 10100101).
We can see that actually unicode is different with UTF-8. But there are some libraries that can do the convert.

Read More

Single Number

XOR features:
a ^ b = c
a ^ c = b
b ^ c = a
a ^ 0 = a
a ^ a = 0
(a ^ b) ^ c = a ^ (b ^ c)

  • I: all the numbers appear twice except one
    XOR all the numbers, and the result is the one
  • II: all the numbers appear three times except one
    Use a int[32] to store each bit, if the number of this bit can be mod by 3, we set it as 0, otherwise we set it as 1.
  • III: all the numbers appear twice except two
    XOR all the numbers so we can get the two by XOR. Since these two elements are not the same, there is at least one bit that different. We find this different position and XOR all the elements which have 1 on position different, the result is one of the two. Finally, we XOR the found one with XOR-ed two, we can get another one.

Majority Number

  • I: find the number appears more than half
    Since it is a strict majority number, we only need to maintain a variable result and a counter. Initially, when the counter faces with a different number, minus one of counter and change the result to this new number, otherwise we only need to add 1 to the counter. And finally, the result we maintained is the result. (We need to do another pass to make sure this number is the one we are looking for)
  • II: find the number appears more than 1/3
    Similar with I, we maintain two variables and two counters so that we can get most appeared two numbers. And then we put these two numbers back to the array to find out which one appears more than 1/3. (We can’t use counter to find out, because counters have minus operation during the traversal)
  • III: find the number appears more than 1/k
    We maintain a HashMap<number, counter>. Since if majority, each k different numbers should have more than one element. When we get k entries in the map, we remove entries which values are 1. Finally, find the entry that have highest value(count), that key should be the majority number. In this way, we can implement this in O(n) time and O(k) extra space.
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    public class Solution {
    public int majorityNumber(ArrayList<Integer> nums, int k) {
    if(nums == null || nums.size() == 0) {
    return -1;
    }

    HashMap<Integer, Integer> map = new HashMap<Integer, Integer>();
    for(int num : nums) {
    if(map.containsKey(num)) {
    map.put(num, map.get(num) + 1);
    } else {
    // if there are k entries, check if there are any entries that
    // its value == 1, that is not satisfied with "more than 1/k"
    if(map.size() == k) {
    Iterator<Map.Entry<Integer, Integer>> iter = map.entrySet().iterator();
    while(iter.hasNext()) {
    Map.Entry<Integer, Integer> entry = iter.next();
    if(entry.getValue() - 1 == 0) {
    // can't use: map.remove(entry.getKey());
    // above wrong code would lead the iter can't find its next !!!
    iter.remove();
    } else {
    map.put(entry.getKey(), entry.getValue() - 1);
    }
    }
    } else {
    map.put(num, 1);
    }
    }
    }

    // find the one that have highest value(count), that key should be the majority number
    int value = 0;
    int result = -1;
    for(Map.Entry<Integer, Integer> entry : map.entrySet()) {
    if(entry.getValue() > value) {
    value = entry.getValue();
    result = entry.getKey();
    }
    }
    return result;
    }
    }

Read More

Data Structure is a way to organize data. It provides some methods to handle data stream, e.g. insert, delete, etc.

Linear Data Structure

Queue & Stack

Min Stack
Use two stacks, one is storing the input, when calling pop() or peek(), pop from another stack, which stores the minimum values from top to bottom.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
class MinStack {
ArrayList<Integer> stack = new ArrayList<Integer>();
ArrayList<Integer> minStack = new ArrayList<Integer>();
public void push(int x) {
stack.add(x);
if(minStack.isEmpty() || minStack.get(minStack.size() - 1) >= x) {
minStack.add(x);
}
return;
}

public void pop() {
if(stack.isEmpty()) {
return;
}
int elem = stack.remove(stack.size() - 1);
if(!minStack.isEmpty() && minStack.get(minStack.size() - 1) == elem) {
minStack.remove(minStack.size() - 1);
}
return;
}

public int top() {
if(!stack.isEmpty()) {
return stack.get(stack.size() - 1);
}
return 0;
}

public int getMin() {
if(!minStack.isEmpty()) {
return minStack.get(minStack.size() - 1);
}
return 0;
}
}

Implement Queue by stacks
Use two stacks, one is for storing elements. When calling pop() or top(), pop the elements from the first stack and push them into the second one.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
public class Solution {
private Stack<Integer> stack1;
private Stack<Integer> stack2;

public Solution() {
// do initialization
stack1 = new Stack<Integer>();
stack2 = new Stack<Integer>();
}

public void push(int element) {
stack1.push(element);
}

public int pop() {
if(stack2.isEmpty()) {
while(!stack1.isEmpty()) {
stack2.push(stack1.pop());
}
}
return stack2.pop();
}

public int top() {
if(stack2.isEmpty()) {
while(!stack1.isEmpty()) {
stack2.push(stack1.pop());
}
}
return stack2.peek();
}
}

Largest Rectangle in Histogram
Brute force: totally O(n^2) windows and O(n) to find the minimun one in each window, so we need O(n^3) time.
Improve: for each number, search both side of it until find two smaller number, calculate the result, O(n^2) cost.
Best: use a stack to store the index (all increased heights), when faced with a smaller one, pop out from tha stack and calculate the area until the value bigger than the peek() one. This method cost O(n) time, with O(n) space in worst case (two passes).

Read More

Clone Graph

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
/**
* Definition for undirected graph.
* class UndirectedGraphNode {
* int label;
* ArrayList<UndirectedGraphNode> neighbors;
* UndirectedGraphNode(int x) { label = x; neighbors = new ArrayList<UndirectedGraphNode>(); }
* };
*/
public class Solution {
public UndirectedGraphNode cloneGraph(UndirectedGraphNode node) {
if(node == null) {
return null;
}

LinkedList<UndirectedGraphNode> queue = new LinkedList<UndirectedGraphNode>();
HashMap<UndirectedGraphNode, UndirectedGraphNode> map = new HashMap<UndirectedGraphNode, UndirectedGraphNode>();
queue.offer(node);
map.put(node, new UndirectedGraphNode(node.label));

while(!queue.isEmpty()) {
UndirectedGraphNode cur = queue.poll();
for(int i = 0; i < cur.neighbors.size(); i++) {
// !!! if node not in map, add a new node into the map !!!
if(!map.containsKey(cur.neighbors.get(i))) {
map.put(cur.neighbors.get(i), new UndirectedGraphNode(cur.neighbors.get(i).label));
queue.offer(cur.neighbors.get(i));
}
// now we are sure that the node is existed, add the neighbors to the value
map.get(cur).neighbors.add(map.get(cur.neighbors.get(i)));
}
}

return map.get(node);
}
}

Topological Sorting
DFS: O(n) time with O(n) space for the map and the result.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
/**
* Definition for Directed graph.
* class DirectedGraphNode {
* int label;
* ArrayList<DirectedGraphNode> neighbors;
* DirectedGraphNode(int x) { label = x; neighbors = new ArrayList<DirectedGraphNode>(); }
* };
*/
public class Solution {
/**
* @param graph: A list of Directed graph node
* @return: Any topological order for the given graph.
*/
public ArrayList<DirectedGraphNode> topSort(ArrayList<DirectedGraphNode> graph) {
ArrayList<DirectedGraphNode> result = new ArrayList<DirectedGraphNode>();
if(graph == null || graph.size() == 0) {
return result;
}
// construct map with all nodes
HashMap<DirectedGraphNode, Integer> map = new HashMap<DirectedGraphNode, Integer>();
for(DirectedGraphNode node: graph) {
// mark 0 as unsorted
map.put(node, 0);
}
// find a new unsorted node to start sorting (if possible):
while (hasUnsorted(map, graph)) {
DirectedGraphNode node = null;
for (DirectedGraphNode temp : graph) {
if (map.get(temp) == 0) {
node = temp;
}
}
// get the node and do sort(search):
sort(map, graph, result, node);
}
return result;
}

// check if there is any node that not yet been sorted
public boolean hasUnsorted(Map<DirectedGraphNode, Integer> map, ArrayList<DirectedGraphNode> graph){
for (DirectedGraphNode node : graph) {
if (map.get(node) == 0) {
return true;
}
}
return false;
}

// search and sort the graph
public void sort(Map<DirectedGraphNode, Integer> map, ArrayList<DirectedGraphNode> graph, ArrayList<DirectedGraphNode> result, DirectedGraphNode node){
if (map.get(node) != 0) {
// if 1: System.out.println("It is not a DAG");
// if 2: sorted
return;
}
// mark 1 as visited(not yet been sorted), do with its neighbors:
map.put(node, 1);
for (DirectedGraphNode next : node.neighbors) {
sort(map, graph, result, next);
}
// mark 2 as sorted
// map.put(node, 2);
result.add(0, node);
}
}

Read More

Your browser is out-of-date!

Update your browser to view this website correctly. Update my browser now

×