We’ve probably all heard of Redis, or even used it quite frequently as part of our infrastructure. Redis is everywhere: it can be a caching layer, a message queue, or even an in-memory database. Whatever the role, performance is always the main reason for using Redis. …


In the previous blogpost, we discussed the need for a consensus protocol in the distributed computing era, along with the most commonly known one, the 2-phase-commit (2PC) algorithm. In this second installment of the series, Paxos, the consensus protocol, will be explained in a simplified manner: its mechanism, how it differs from 2PC, and how it solves the outstanding issues that 2PC leaves behind.


Let’s review the problems that the algorithm needs to address:

Consensus protocol liveness property:

  • For many values proposed at the same time, only one is chosen.
  • When the value is chosen, all the nodes eventually…


You may have heard of transactions, a very important feature for maintaining the atomicity of a database. For those who don’t know, atomicity guarantees that updates to the database either all occur or do not occur at all.

On a single-node database, transactions are usually implemented by the storage engine, using a well-known logging technique such as the binary log (MySQL), the transaction log (Oracle)… However, things become complicated when multiple nodes come into play. When a distributed transaction is committed independently on all of the nodes, there can be a case where the commit succeeds on some nodes and fails on others, which violates the…
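
The failure case described above can be reproduced with a toy model. This is only an illustrative sketch: the `Node` class, the `fail_on_commit` switch and the `commit_independently` helper are all hypothetical names, and no real database is involved.

```python
class Node:
    """A toy database node (hypothetical) that may fail when asked to commit."""

    def __init__(self, name, fail_on_commit=False):
        self.name = name
        self.fail_on_commit = fail_on_commit
        self.data = {}

    def commit(self, key, value):
        # Simulate a node-local failure (disk full, crash, constraint error...).
        if self.fail_on_commit:
            raise RuntimeError(f"{self.name}: commit failed")
        self.data[key] = value


def commit_independently(nodes, key, value):
    """Commit on every node with no coordination; return the names that failed."""
    failed = []
    for node in nodes:
        try:
            node.commit(key, value)
        except RuntimeError:
            # Nodes that already committed are NOT rolled back here.
            failed.append(node.name)
    return failed


nodes = [Node("a"), Node("b", fail_on_commit=True), Node("c")]
failed = commit_independently(nodes, "balance", 100)
# Nodes "a" and "c" now hold the new value while "b" does not:
# the update is neither fully applied nor fully absent.
```

Because each node decides on its own, nothing undoes the commits that already succeeded once a later node fails.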

Why Zero-copy?

What’s happening under the hood when the OS copies a file or transfers a file to another host? To the naked eye the process looks simple: the OS first reads the content of the file, then writes it to another file, and it’s done! However, things become complicated when we look more closely and take memory into account.

As depicted in the dataflow below, the file read from disk must go through the kernel system cache, which resides in kernel space; the data is then copied to a userspace memory area before being written back to a new file…
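
As a rough illustration of the two paths, here is a minimal Python sketch. It assumes a Linux host, where `os.sendfile` accepts a regular file as the destination; other platforms may require a socket, so the second function falls back to the classic copy. The function names are hypothetical.

```python
import os


def copy_via_userspace(src, dst, chunk=64 * 1024):
    """Classic copy: each chunk is read into a userspace buffer, then written out."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(chunk)   # kernel cache -> userspace buffer
            if not buf:
                break
            fout.write(buf)         # userspace buffer -> kernel again


def copy_via_sendfile(src, dst):
    """Ask the kernel to move the bytes itself, without a userspace buffer."""
    try:
        with open(src, "rb") as fin, open(dst, "wb") as fout:
            remaining = os.fstat(fin.fileno()).st_size
            offset = 0
            while remaining > 0:
                sent = os.sendfile(fout.fileno(), fin.fileno(), offset, remaining)
                if sent == 0:       # unexpected end of input
                    break
                offset += sent
                remaining -= sent
    except (AttributeError, OSError):
        # os.sendfile is missing or refused this pair of descriptors:
        # redo the whole copy the classic way.
        copy_via_userspace(src, dst)
```

Both functions produce the same destination file; the difference is only in how many times the data crosses the kernel/userspace boundary.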

We have several Kafka clusters of growing importance at CocCoc. As usage scales up, we would like to get a handle on the dataflow components. We decided to enforce producer/consumer ACLs on them to ensure data provenance, using Kafka’s built-in Authorizer, which stores its settings in ZooKeeper.

Starting points

  • Kafka cluster with PLAINTEXT (port 9092) and SASL_PLAINTEXT (let’s say port 9093) listeners for clients, and broker interconnect using PLAINTEXT
  • Kafka version ≥ 2, for the very useful prefixed ACL support introduced in KIP-290
  • Firewall on production clusters, restricting Kafka ports (both because subnet-based allow-hosts is not implemented yet, and to reduce search-space for running services)
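
To show why prefixed ACLs are handy, here is a toy Python model of the matching rule. It is not Kafka's Authorizer and does not call any Kafka API: the `Acl` class, the principals and the topic names are hypothetical, and the sketch assumes deny-by-default while ignoring deny rules, wildcards, hosts and super users.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Acl:
    principal: str      # e.g. "User:producer-a" (hypothetical)
    operation: str      # "Read" or "Write"
    pattern_type: str   # "LITERAL" or "PREFIXED"
    name: str           # full topic name, or a prefix for PREFIXED


def matches(acl, topic):
    """A PREFIXED rule covers every topic starting with its name."""
    if acl.pattern_type == "PREFIXED":
        return topic.startswith(acl.name)
    return topic == acl.name


def is_allowed(acls, principal, operation, topic):
    """Allow only if some rule matches; anything else is denied (assumption)."""
    return any(
        a.principal == principal and a.operation == operation and matches(a, topic)
        for a in acls
    )


acls = [
    Acl("User:producer-a", "Write", "PREFIXED", "team-a."),
    Acl("User:consumer-b", "Read", "LITERAL", "team-a.events"),
]
```

With a single prefixed rule, `User:producer-a` may write to any current or future topic whose name starts with `team-a.`, while the literal rule covers exactly one topic.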

Coccoc uses KVM (Kernel-based Virtual Machine) virtualization infrastructure as a private cloud solution, and improving its performance and stability is always a challenging yet exciting task. In this blogpost, we will go through a brief description of CPU flags, and see how a little tweak to them can significantly improve the performance of a virtual machine in particular situations.

CPU flags

CPU flags (aka CPU features) are simply attributes of a CPU that denote which features it supports. A feature can be as simple as a floating point unit, hyperthreading technology, or the ability to extend physical memory to 64-bit address…
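
On Linux, the flags of the running CPU are listed in `/proc/cpuinfo`. The sketch below is a minimal way to read them, assuming an x86 host where the line is labelled `flags` (other architectures use other labels, for example `Features` on ARM). The helper names are hypothetical.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found (e.g. non-x86 host)


def read_cpu_flags(path="/proc/cpuinfo"):
    """Read and parse the flags of the current machine (Linux only)."""
    with open(path, encoding="utf-8", errors="replace") as f:
        return parse_cpu_flags(f.read())


# Shortened sample in the /proc/cpuinfo layout: fpu (floating point unit),
# ht (Hyper-Threading), lm (long mode, i.e. 64-bit), pae (physical address extension).
sample = "processor\t: 0\nflags\t\t: fpu ht lm pae\n"
flags = parse_cpu_flags(sample)
```

Running `read_cpu_flags()` on the host and inside a guest, then comparing the two sets, is a quick way to see which features a virtual machine is missing.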

Notable CPU flags (Intel…


For a very long time, the infrastructure at Coccoc only supported traditional services running on physical hosts. Recently, with the rise of microservices, the trendy container orchestration technology K8s has been adopted as a big part of the grand design. Therefore, the infrastructure also needs a huge shift to support not only conventional services but containerized microservices as well. In this article we’re gonna go through how load-balancing is being implemented to publish K8s services from a very old-fashioned setup.

Load-balancing infrastructure

simple traditional LB infrastructure

As depicted above, the primitive load-balancing infrastructure is straightforward:

  • 4 LBs are Internet-facing servers which utilize master-backup VRRP (Virtual…


Many of us have heard of Thin/Thick provisioning, the very well-known features advertised widely on the market. In this blogpost we’re not only going to explain what they are, but also dig deeper into what’s going on in the system behind these features, with a few simple lines of C code, written by our own hands! There are even some interesting facts about the storage that we use day by day. Let’s go on.


For system admins, devops, developers… And all of those who are studious, curious and want to learn more about small things working under the hood.

First, What is Disk Provisioning?



Managing secrets and other sensitive entities is a troublesome yet challenging task. A tremendous number of developers out there are still storing plaintext passwords on collaboration platforms such as Gitlab, or keeping credentials inside Docker images which are accessible by anyone. At Coccoc, we’re using self-hosted Gitlab and Docker Harbor (both only opened locally), which helps mitigate the leak of secrets to the outside; it is, however, not a real solution to this kind of problem.

Because the temptation to put secrets directly into git, or to hardcode credentials in the source code, is so huge, we procrastinated to restrict…

CocCoc Techblog

From engineers who’re devoting their time to the Vietnamese browser and search engine.
