About Me

  • I work on things that interest me. In the past, these were speech recognition, recommender systems, and large-scale distributed systems. Now I spend my time on infrastructure and system-level work, including but not limited to eBPF and container orchestration. I am also interested in projects written in Rust or Haskell.
  • I work at Meta. I am not interested in new opportunities at the moment.
  • My email address is "bWVAcGF1bG1lLm5nCg==", base64 encoded
  • If you would like to send me an encrypted message, my GPG key ID is 5F61EAF0E257055A on keyserver.ubuntu.com.
  • Open Source Projects

    jieba-rs

    Chinese Word Segmentation with HMM Implemented in Rust.

    cedarwood

    Efficiently-updatable double-array trie in Rust.

    logq

    A command-line toolkit for analyzing log files with SQL, implemented in Rust.

    t-digest

    t-digest is a data structure for on-line accumulation of rank statistics, implemented in Rust.

    zw-fast-quantile

    Zhang Wang Fast Approximate Quantiles Algorithm in Rust.

    kvs

    A Bitcask key-value store, implemented for the PingCAP talent plan.

    go-pdqsort

    Go implementation of pattern-defeating quicksort.

    calljj

    An esoteric language running on an SECD machine. The name is a pun on a popular Taiwanese music video and Lisp's callcc. It is basically the Grass language with different symbols. It is implemented in Haskell.

    elasticsearch_dsl

    A domain-specific language for Elasticsearch in Python. An emerging trend in recent years among software such as MongoDB, Elasticsearch and Chef is to expose a JSON interface that accepts complex requests. These systems give up traditional SQL queries and adopt JSON as the text encoding of an abstract syntax tree. Therefore, whenever you compose a request to these services, you are actually hand-coding an abstract syntax tree in JSON. Although this is flexible and easy to extend, it is also error-prone and hard to maintain. A common solution is to write a domain-specific language. Given Python's language design, a naive and natural approach is to use classes to denote AST nodes and the Visitor pattern to generate the underlying JSON.
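
    The class-plus-Visitor idea described above can be sketched roughly as follows. This is a minimal illustration, not the actual elasticsearch_dsl API: the names Node, Term, Range, And, JsonVisitor and to_json are hypothetical, and the emitted JSON is assumed to follow the shape of Elasticsearch's term, range and bool queries.

```python
import json


class Node:
    """Base class for AST nodes; dispatches to a visitor method by class name."""

    def accept(self, visitor):
        method = getattr(visitor, "visit_" + type(self).__name__)
        return method(self)


class Term(Node):
    """Exact match of a field against a value (hypothetical node type)."""

    def __init__(self, field, value):
        self.field = field
        self.value = value


class Range(Node):
    """Range on a field; either bound may be omitted (hypothetical node type)."""

    def __init__(self, field, gte=None, lte=None):
        self.field = field
        self.gte = gte
        self.lte = lte


class And(Node):
    """Conjunction of child nodes (hypothetical node type)."""

    def __init__(self, *children):
        self.children = children


class JsonVisitor:
    """Walks the AST and builds the nested dict that is later serialized to JSON."""

    def visit_Term(self, node):
        return {"term": {node.field: node.value}}

    def visit_Range(self, node):
        # Only include the bounds that were actually given.
        bounds = {}
        if node.gte is not None:
            bounds["gte"] = node.gte
        if node.lte is not None:
            bounds["lte"] = node.lte
        return {"range": {node.field: bounds}}

    def visit_And(self, node):
        return {"bool": {"must": [child.accept(self) for child in node.children]}}


def to_json(node):
    """Render an AST as a JSON request body."""
    return json.dumps({"query": node.accept(JsonVisitor())}, sort_keys=True)


query = And(Term("user", "kimchy"), Range("age", gte=18))
print(to_json(query))
```

    Keeping the JSON generation in a separate visitor means the node classes only describe the query, so a second visitor (for validation or pretty-printing, say) could be added without touching them.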

    solrdump

    A Clojure library designed to read the raw index from Solr. Both Solr and Elasticsearch are built on Lucene. However, when you would like to migrate from one to the other, there isn't any tool to help you do so. Newer versions of Solr have a built-in dump handler, but it only exports the raw Lucene index, and the output is not JSON. You could install the Lucene jar and manipulate the raw index with the Java API. The naive way would be to write it in Java, but to enjoy the feeling of dipping into a new technology, we decided to write it in Clojure. The dump format is JSON.

    Talks

    Books

    Learn You a Haskell for Great Good! (Chinese Edition)

    Haskell learning resources in Chinese were scarce as of 2011. As one of the initiators of the #haskell.tw channel on freenode, I took over Fleuria's unfinished translation of the first eight chapters and went on to translate the last six chapters. The translation is licensed under CC-BY-SA-NC. The work has been viewed by thousands of people.