Automate deploying this blog using Github action

了解github action后,第一步想到的是改造下这个博客。利用github action的工作流,做到直接在github上编辑markdown,保存后,触发一个设定的workflow,做到编译,部署编译好的静态网站到GitHub pages。 这里面关键是,如何部署编译好的静态网站到Github Pages。我用的是hexo这个静态网站生成器,代码是在一个独立repo,与要部署的目标repo不同。在GitHub的runner也就是服务器上,如何设置github credentials,从而能从那里部署是关键。 几番搜索发现了有个现成的github action: hexo-action。简单总结下: Generate a deploy key pair. ssh-keygen -t rsa -C "username@example.com" Public key is saved in github page repo’s Settings/Deploy Keys. Private key is saved as a secret in source repo’s Settings/Secrets, e.g., named DEPLOY_KEY. Add the github workflow: name: Blog CICD on: push: branches: - 'main' jobs: build: runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 with: submodules: true - name: Use Node.js uses: actions/setup-node@v3 with: node-version: '20.x' - name: Install dependencies run: npm install hexo - name: Deploy id: deploy uses: sma11black/hexo-action@v1.0.3 with: deploy_key: ${{ secrets.DEPLOY_KEY }} user_name: github_action 这是第一篇自动部署的文章。Let’s see if it works. ...

January 7, 2024 · 1 min · 103 words

Recap of Scala tutorial

A cheatsheet that gets you started. The original tutorial can be found here Main function The normal version: object Hello { def main(args: Array[String]) = { println("Hello, world") } } The version that extends App trait object Hello extends App { println("Hello, world") } // To access args. object Hello extends App { if (args.size == 0) println("Hello, you") else println("Hello, " + args(0)) } As you may notice: no ; needed as ending in Scala. Variables The type in declaration is optional. val s: String = "hello" // val for immutable, final in Java. val s = "hello" // Same as above. var i: Int = 42 // var for mutable. var i = 42 // Same as above. Built-in types: val b: Byte = 1 // 8-bit signed [-128, 127](-2^7 to 2^7-1, inclusive) val s: Short = 1 // 16-bit signed. val x: Int = 1 // 32-bit signed. val l: Long = 1 // 64-bit signed. val f: Float = 3.0 // 32-bit IEEE 754 single-precision float. val d: Double = 2.0 // 64-bit IEEE 754 double-precision float. val c: Char = 'a' // 16-bit unsigned Unicode character. val s: String = "str" // A sequence of Char BigInt and BigDecimal for large numbers: ...

July 12, 2021 · 14 min · 2835 words

详解水塘抽样(Reservoir sampling)

一个场景 当内存无法加载全部数据时,如何从包含未知大小的数据流中随机选取k个数据,并且要保证每个数据被抽取到的概率相等。 水塘抽样 把上述问题具体化为给定一个未知大小数组data[],从中选取k个数,要求每个数被选中的概率相等。假设数组长度为n,则每个数被选中概率应为k/n。 策略是: 对最前面k个数,直接放到数组chosen[]即chosen[i] = data[i] 0 <= i < k。 对data[i], i >= k,先生成个[0...i)间的随机数(左闭右开),假设rand(0, i) = j,则如果j < k,则做替换chosen[j] = data[i]。否则,nothing happens. 如此反复直到结束。 Java实现如下: public class ReservoirSampling { private Random rand = new Random(); public int[] sampling(int[] data, int k) { int[] chosen = new int[k]; int i = 0; for (int d : data) { if (i < k) { chosen[i] = d; } else { int j = rand.nextInt(i); if (j < k) { chosen[j] = d; } } ++i; } return chosen; } } 每个元素被选中概率相等的证明如下: ...

March 24, 2021 · 3 min · 440 words

Java集合速览

A quick review of Java Collection and Map key methods. Array based collections usually have: Method Time complexity add O(1) insert(int index) O(N) remove O(1) remove(int index) O(N) find(Object value) O(N) PriorityQueue, TreeSet and TreeMap are implemented based on balanced tree, thus: Method Time complexity offer O(logN) poll O(logN) peek O(1) Overview via a diagram Collections extends Iterable which has below key methods: Method Description boolean hasNext() This method returns true if the iterator has more elements. object next() It returns the element and moves the cursor pointer to the next element. void remove() This method removes the last elements returned by the iterator. List methods Method Description void add​(int index, E element) Inserts the specified element at the specified position in this list (optional operation). boolean add​(E e) Appends the specified element to the end of this list (optional operation). boolean addAll​(Collection<? extends E> c) Appends all of the elements in the specified collection to the end of this list, in the order that they are returned by the specified collection’s iterator (optional operation). Eget​(int index) Returns the element at the specified position in this list. E set​(int index, E element) Replaces the element at the specified position in this list with the specified element (optional operation). booleanisEmpty() Returns true if this list contains no elements. int size() Returns the number of elements in this list. <T> T[] toArray​(T[] a) Returns an array containing all of the elements in this list in proper sequence (from first to last element); the runtime type of the returned array is that of the specified array. E.g., String[] y = x.toArray(new String[0]); Queue methods Method Description void add(E e) Inserts the specified element into this queue if it is possible to do so immediately without violating capacity restrictions, returning true upon success and throwing an IllegalStateException if no space is currently available. E element() Retrieves, but does not remove, the head of this queue. boolean offer​(E e) Inserts the specified element into this queue if it is possible to do so immediately without violating capacity restrictions. E peek() Retrieves, but does not remove, the head of this queue, or returns null if this queue is empty. E poll() Retrieves and removes the head of this queue, or returns null if this queue is empty. E remove() Retrieves and removes the head of this queue. Dequeue / LinkedList Methods Method Description void addFirst​(E e) Inserts the specified element at the front of this deque if it is possible to do so immediately without violating capacity restrictions, throwing an IllegalStateException if no space is currently available. void addLast​(E e) Inserts the specified element at the end of this deque if it is possible to do so immediately without violating capacity restrictions, throwing an IllegalStateException if no space is currently available. boolean offerFirst​(E e) Inserts the specified element at the front of this deque unless it would violate capacity restrictions. boolean offerLast​(E e) Inserts the specified element at the end of this deque unless it would violate capacity restrictions. E peekFirst() Retrieves, but does not remove, the first element of this deque, or returns null if this deque is empty. E peekLast() Retrieves, but does not remove, the last element of this deque, or returns null if this deque is empty. E pollFirst() Retrieves and removes the first element of this deque, or returns null if this deque is empty. E pollLast() Retrieves and removes the last element of this deque, or returns null if this deque is empty. E pop() Pops an element from the stack represented by this deque. void push​(E e) Pushes an element onto the stack represented by this deque (in other words, at the head of this deque) if it is possible to do so immediately without violating capacity restrictions, throwing an IllegalStateException if no space is currently available. E removeFirst() Retrieves and removes the first element of this deque.` E removeLast() Retrieves and removes the last element of this deque. Summary of Deque methods First Element (Head) Last Element (Tail) Throws exception Special value Throws exception Special value Insert addFirst(e) offerFirst(e) addLast(e) offerLast(e) Remove removeFirst() pollFirst() removeLast() pollLast() Examine getFirst() peekFirst() getLast() peekLast() Comparison of Queue and Deque methods Queue Method Equivalent Deque Method add(e) addLast(e) offer(e) offerLast(e) remove() removeFirst() poll() pollFirst() element() getFirst() peek() peekFirst() Comparison of Stack and Deque methods Stack Method Equivalent Deque Method push(e) addFirst(e) pop() removeFirst() peek() getFirst() Method Description boolean add​(E e) Adds the specified element to this set if it is not already present. TreeSet methods Method Description boolean add​(E e) E ceiling​(E e) Returns the least element in this set greater than or equal to the given element, or null if there is no such element. E first() Returns the first (lowest) element currently in this set. E floor​(E e) Returns the greatest element in this set less than or equal to the given element, or null if there is no such element. SortedSet headSet​(E toElement) Returns a view of the portion of this set whose elements are strictly less than toElement. NavigableSet headSet​(E toElement, boolean inclusive) Returns a view of the portion of this set whose elements are less than (or equal to, if inclusive is true) toElement. E higher​(E e) Returns the least element in this set strictly greater than the given element, or null if there is no such element. E last() Returns the last (highest) element currently in this set. E lower​(E e) Returns the greatest element in this set strictly less than the given element, or null if there is no such element. E pollFirst() Retrieves and removes the first (lowest) element, or returns null if this set is empty. E pollLast() Retrieves and removes the last (highest) element, or returns null if this set is empty. NavigableSet subSet​(E fromElement, boolean fromInclusive, E toElement, boolean toInclusive) Returns a view of the portion of this set whose elements range from fromElement to toElement. SortedSet subSet​(E fromElement, E toElement) Returns a view of the portion of this set whose elements range from fromElement, inclusive, to toElement, exclusive. SortedSet tailSet​(E fromElement) Returns a view of the portion of this set whose elements are greater than or equal to fromElement. NavigableSet tailSet​(E fromElement, boolean inclusive) Returns a view of the portion of this set whose elements are greater than (or equal to, if inclusive is true) fromElement. Map methods Method Description V get​(Object key) Returns the value to which the specified key is mapped, or null if this map contains no mapping for the key. V put​(K key, V value) Associates the specified value with the specified key in this map (optional operation). boolean containsKey​(Object key) Returns true if this map contains a mapping for the specified key. boolean containsValue​(Object value) Returns true if this map maps one or more keys to the specified value. Set<Map.Entry<K,​V» entrySet() Returns a Set view of the mappings contained in this map. Set keySet() Returns a Set view of the keys contained in this map. Collection values() Returns a Collection view of the values contained in this map. Default methods: ...

March 7, 2021 · 9 min · 1822 words

详解线段树之进阶(Segment Tree)

懒更新(lazy update) 上文留下一个问题没有解决,就是对一个区间更新。对一个元素的更新复杂度为O(N),N为数组长度,按上文方法对区间更新,则复杂度为L*O(N), L为区间长度。 另外,更新原数组元素,是从线段树中对应的叶子节点,一路向上更新到根节点。如果所要更新的区间有很多公共祖先节点,那么这些节点每次都需要被更新一次。如图中如果我们要更新数组中的2, 8, 6分别+5, +1, +9,可以看到图中公共节点如19会更新2次,37则有3次。 另外,在实际应用中,可能只是某些热点数据会频繁被读取,而大部分数据是在冷宫常年不见天日。有些更新没有必要立即落实到每一个节点,而可以在其被读取时再落实更新。 这就是懒更新的登场时刻。 基于数组的实现 既然需要将更新延后落实,那么就需要保存这些更新,所以需要一个与tree长度一致的lazy数组来保存对应节点没有落实的更新。代码如下。几个注意点: 进入queryCore时,首先需要检查有无当前节点index的更新,如果有,先执行更新,清空该更新,然后继续,其余与正常线段树query区别。 进入updateCore时,也是首先要检查有无当前节点index的更新,如果有,重复上面。 另外,当更新范围[left, right]完全覆盖当前节点代表区间[segLeft, segRight]时,要再次执行更新,但这次update参数是newVal。这是关键点所在,执行完更新,就不用继续递归调用了,因为孩子节点的更新被保存在了lazy[leftChild]和lazy[rightChild]。下次再调用query或者update时,如果执行到节点leftChild或rightChild,则会执行之前保存的更新。 class SegmentTree { private int[] tree; private int[] lazy; private int n; public SegmentTree(int n) { this.n = n; int len = (1 << (1 + (int)(Math.ceil(Math.log(n) / Math.log(2))))) - 1; tree = new int[len]; lazy = new int[len]; } public int query(int left, int right) { return queryCore(left, right, 0, n - 1, 0); } public void update(int left, int right, int newVal) { updateCore(left, right, 0, n - 1, 0, newVal); } private int queryCore(int left, int right, int segLeft, int segRight, int index) { if (lazy[index] != 0) { // Execute un-merged update. execute(index, lazy[index], segLeft, segRight); // Clear merged update. lazy[index] = 0; } if (left > segRight || right < segLeft) { return 0; } if (left <= segLeft && segRight <= right) { return tree[index]; } int mid = getMid(segLeft, segRight); return merge(queryCore(left, right, segLeft, mid, index * 2 + 1), queryCore(left, right, mid + 1, segRight, index * 2 + 2)); } private void updateCore(int left, int right, int segLeft, int segRight, int index, int newVal) { if (lazy[index] != 0) { execute(index, lazy[index], segLeft, segRight); lazy[index] = 0; } if (left > segRight || right < segLeft) { return; } if (left <= segLeft && segRight <= right) { // Execute update for newVal. execute(index, newVal, segLeft, segRight); return; } int mid = getMid(segLeft, segRight); updateCore(left, right, segLeft, mid, index * 2 + 1, newVal); updateCore(left, right, mid + 1, segRight, index * 2 + 2, newVal); tree[index] = merge(tree[index * 2 + 1], tree[index * 2 + 2]); } // Execute an update against tree[index] with interval [segLeft, segRight]. private void execute(int index, int update, int segLeft, int segRight) { // Apply the udpate. tree[index] = merge(tree[index], update); if (segLeft != segRight) { // Propagate the update to children. lazy[2 * index + 1] = merge(lazy[2 * index + 1], update); lazy[2 * index + 2] = merge(lazy[2 * index + 2], update); } } private int merge(int x, int y) { return Math.max(x, y); } private int getMid(int left, int right) { return left + (right - left) / 2; } } 基于树的实现 其实就是把上面的代码做一次翻译,不多赘述。 ...

March 5, 2021 · 5 min · 986 words

详解线段树之入门(Segment Tree)

何为线段树 线段树是基于要研究的数组而构建的数据结构,呈二叉树状。线段树的叶子结点就是所要研究的数组,非叶子节点表示数组的一段区间的某个性质的值。比如此区间的和,或者最大值最小值等等。 为什么需要线段树呢?试想你需要设计一个功能,给定一个数组,需要你提供两个操作,一是更改某个元素,而是索引某个区间的和,或者最大值,或者最小值,这里以和举例。最直接也是最简单做法当然是直接在数组上操作,O(1)更新,O(N)索引和,N为区间长度。 class NumArray { public NumArray(int[] nums) { //TODO } // Updates nums[index] = val. public void update(int index, int val) { //TODO} // Gets sum of nums[left ... right]. public int query(int left, int right) { //TODO } } 再进一步,如果需要更新一个区间的元素呢,比如,nums[i] += val, for left <= i <= right。 class NumArray { // ... // Add val, for nums[i] with left <= i <= right public int add(int left, int right, int val) { //TODO } } 继续以上方法,就是遍历区间,一一更新,与索引相同复杂度:O(N)。 有没有能索引复杂度的方法?有,线段树就是其中一种,通过构造线段树,可以将索引复杂度降低到O(logN), N为区间长度,但会增加update复杂度到O(logN),N为原数组长度。而一般的应用场景,读远大于写,所以这个改进非常有益于整体性能提升。 那如何构造线段树呢?可以通过数组表示,也可以通过一棵树来表示。先说数组表示法。 基于数组的线段树 给定的研究数组a[],长度为N,线段树用数组tree[]表示,则有: ...

March 4, 2021 · 4 min · 739 words

详解二叉索引树(Binary Index Tree又名Fenwick Tree)

何为BIT BIT是一种数据结构,名为树其实用数组即可表达。其主要作用是给定了一个数组(或多维数组),BIT可用来: O(logN)的复杂度计算给定范围的和,也就是求sum(low, high) O(logN)的复杂度更新数组的一个元素 其构造方法巧妙但简单,可谓大道至简。 给定一个数组a[0 ... N-1],长度N。 BIT表示为indexes[0 ... N],长度N+1。indexes[i]=sum(g(i) ... i], g(i)=i-i&(-i),注意是左开右闭区间。i&(-i)得到的是一个数字的二进制表示中只留下最右的1后的数字,比如12=[1100],则12&(-12)只留下最右的1就成了4=[0100]。 注意到indexes[]从1开始计数,所以始终有indexes[0]=0。为了说明方便,在数组a[]前面添加了个元素0。 看图说话,如此构造后,原数组的和sum[0 ... i]就可以通过把能覆盖range[0 ... i]的indexes[]加起来得到。比如求sum[0 ... 14],步骤如下,辅以图示。 先加上indexes[14],覆盖了a[14], a[13]。 14=[1110],最右1留下后是2=[10],减去后得到12,加上indexes[12]。 12=[1100],最右1留下后是4=[100],减去后得到8,加上indexes[8]。 8=[1000],最右1留下后是8=[1000],减去后得到0,结束。 所以何为BIT,其本质是通过原数组构造一个新数组indexes[],其中每个元素表示原数组中一段连续子数组的和。范围求和就从线性遍历原数组(O(N)),变成了在indexes[]中快速查找(O(logN))能覆盖所求范围的那几个元素。 而更新则是与求和相反的一个过程,原数组一个元素a[i]更新后,首先需要更新indexes[i],然后要不停向上,更新每个覆盖了a[i]的indexes元素。比如,a[9]=2 -> a[9]=10,则有: 计算一个更新delta, delta=10-2,更新indexes[9]+=delta, 9=[1001],最右1留下后是1=[0001],加上后得到10,更新indexes[10] += delta。 10=[1010],最右1留下后是2=[0010],加上后得到12,更新indexes[12] += delta。 12=[1100],最右1留下后是4=[0100],加上后得到16,更新indexes[16] += delta。 代码如下: class BIT { private int[] indexes; private int[] nums; public BIT(int[] nums) { this.nums = nums; indexes = new int[nums.length + 1]; for (int i = 0; i < nums.length; i++) { add(i + 1, nums[i]); } } // Update: nums[i]=val. public void update(int i, int val) { int delta = val - nums[i]; add(i + 1, delta); nums[i] = val; } // Get sum[i ... j]. public int query(int i, int j) { return query(j + 1) - query(i); } // Get sum[0 ... i-1]. private int query(int i) { int sum = 0; while (i > 0) { sum += indexes[i]; i -= lowBit(i); } return sum; } // Add: nums[i] += delta. private void add(int i, int delta) { while (i < indexes.length) { indexes[i] += delta; i += lowBit(i); } } private int lowBit(int i) { return i & (-i); } } 两个变招 上面介绍的最基本的索引树支持两项操作,单点更新和范围索引(此处是求和)。在此基础上,可以稍加改变即可支持范围更新和单点索引,更进一步,可以支持范围更新和范围索引。 ...

March 3, 2021 · 5 min · 960 words

详解归并排序之应用(Merge sort)

归并排序基于分治思想,基本步骤是把数组一分为二,分别排序,然后将两个排序好的数组合并,合并两个排序好的数组可以在O(N)的复杂度完成。根据T(N) = 2T(2/N) + N,可以推导出时间复杂度为O(NlogN)。代码如下: public void sort(int[] nums) { mergeSort(nums, 0, nums.length - 1); } private void mergeSort(int[] nums, int start, int end) { if (start >= end) { return; } int mid = start + (end - start) / 2; // Step 1: sort nums[start ... mid]. mergeSort(nums, start, mid); // Step 2: sort nums[mid+1 ... end]. mergeSort(nums, mid + 1, end); // Step 3: merge two sorted sub-arrays. merge(nums, start, mid, end); } // nums[start ... mid] and nums[mid+1 ... end] are sorted, respectively. // Merge two sorted sub-array in O(end - start + 1). private void merge(int[] nums, int start, int mid, int end) { int len = end - start + 1; int[] sorted = new int[len]; int i = start; int j = mid + 1; int k = 0; while (i <= mid && j <= end) { if (nums[i] <= nums[j]) { sorted[k++] = nums[i++]; } else { sorted[k++] = nums[j++]; } } while (i <= mid) { sorted[k++] = nums[i++]; } while (j <= end) { sorted[k++] = nums[j++]; } for (k = 0, i = start; i <= end; ++i, ++k) { nums[i] = sorted[k]; } } 归并排序代码简单易懂,其厉害之处在于巧加应用,可降低问题复杂度,仅以几例说明。 ...

March 2, 2021 · 6 min · 1259 words

详解并查集(Union Find)

What’s Union Find? Union Find is useful in finding connected components. The idea behind UF is quite simple. You have N items initially. Each item forms its own group by setting parent to pointing to itself, i.e., parents[i] = i. Then you can union two items if they are connected or share some common properties in your problem definition. E.g., union(1,3) will connect item 1 and 3 and make their parents both pointing to the same item. You can continue union(x, y) and put more items into one group. A very common implementation of Disjoint Set Union. Here I applied two optimization: path compression and union by rank. The idea is to flatten the tree when find(x) is called so as to reduce the time complexity, which is O(logN) now. In reality, the amortized time complexity is a very small constant. Refer to wiki for more detailed discussion. ...

March 1, 2021 · 6 min · 1161 words

Clubhouse需要你解决的问题之三

书接前文,pm升到partner level后,打算再接再厉,把clubhouse推向全宇宙。 之前一个房间参考微信设置只能容纳500人,pm说500人太少了,应该能容纳5亿人。 dev一时间不知如何回应,只能拉出了张小龙:做产品不是要节制么。 pm:做产品没有一成不变的方法论,每个产品我们都要追问自己几个问题:你这个产品的底层逻辑是什么?顶层设计在哪?最终交付价值是什么?过程的抓手在哪?如何保证闭环?你比别人的亮点在哪优势在哪?你的思考和沉淀又是什么?你的独特价值在哪?(*) dev:独特价值在哪? pm:我们能容纳5亿人在一个房间,地球村我都嫌路远,我称之为地球房。 dev:可是我的技术能力可能实现不了这么高难度的服务,认真再想了下,以我的技术能力还是去掉可能二字。 pm:好,那就5000人吧,我已经让步了5个数量级了,可不能再讨价划价了哦。另外,考虑到你的技术能力,给你10%的冗余范围,就是你分布式环境下计数不用做到丝毫无误,高并发下从4500一下跳到5500再阻止人加入也可以,4500时就阻止人加入也可以。 dev:你是不是一开始就只想要5000人的房间。 pm:你知道你为什么还是Senior SDE么? dev:请指教 pm:这个功能deliver时,我跟你1:1 请问,如何完成这个功能,限制房间人数为5000人,超过5000人时自动阻止其余人加入。 *注: 此段子转自fenng

February 6, 2021 · 1 min · 17 words