You may have heard of transactions, a very important feature for maintaining the atomicity of a database. For those who don't know, atomicity guarantees that the updates to the database either all occur or none occur.
On a single-node database, transactions are usually implemented by the storage engine, using a well-known technique called binary log (MySQL), transaction log (Oracle)… However, things become complicated when multiple nodes come into play. If a distributed transaction is committed independently on each node, the commit may succeed on some nodes and fail on others, which violates the atomicity guarantee. Therefore all the nodes need to be aware of each other and agree together on a decision, commit or abort, through what is called a consensus protocol. Consensus applies not only to atomic commit but also to other situations, such as leader election or any operation that requires agreement from multiple nodes. …
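The excerpt does not name a specific protocol, so the following is only a minimal sketch of the "all nodes agree: commit or abort" idea, assuming a two-phase, vote-then-decide scheme. `Node`, `can_commit` and `atomic_commit` are hypothetical names made up for illustration, and the sketch ignores crashes, timeouts and network failures, which a real protocol has to handle.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical participant holding a tiny in-memory key-value store."""
    name: str
    can_commit: bool = True  # set to False to simulate a node that cannot commit
    data: dict = field(default_factory=dict)
    staged: dict = field(default_factory=dict)

    def prepare(self, updates: dict) -> bool:
        """Phase 1: stage the updates and vote yes (True) or no (False)."""
        if not self.can_commit:
            return False
        self.staged = dict(updates)
        return True

    def commit(self) -> None:
        """Phase 2, commit: make the staged updates visible."""
        self.data.update(self.staged)
        self.staged = {}

    def abort(self) -> None:
        """Phase 2, abort: throw the staged updates away."""
        self.staged = {}


def atomic_commit(nodes, updates) -> bool:
    """Commit on every node only if every node votes yes; otherwise abort everywhere."""
    votes = [node.prepare(updates) for node in nodes]
    if all(votes):
        for node in nodes:
            node.commit()
        return True
    for node in nodes:
        node.abort()
    return False


if __name__ == "__main__":
    healthy = [Node("a"), Node("b")]
    print(atomic_commit(healthy, {"x": 1}), [n.data for n in healthy])

    mixed = [Node("c"), Node("d", can_commit=False)]
    print(atomic_commit(mixed, {"x": 2}), [n.data for n in mixed])
```

The point of the two phases is that no node applies the update until it knows every other node is able to apply it too, so a single "no" vote leaves all stores untouched.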
What happens under the hood when the OS copies a file or transfers it to another host? To the naked eye the process looks simple: the OS first reads the content of the file, then writes it to another file, and it's done! However, things become complicated when we look more closely and take memory into account.
As depicted in the dataflow below, a file read from disk must go through the kernel's system cache, which resides in kernel space; the data is then copied to a memory area in userspace before being written back to the new file, which in turn goes through a kernel memory buffer before it is really flushed out to disk. The procedure involves many unnecessary copies back and forth between kernel space and userspace that do no actual work, and these operations cost system resources and context switches as well. …
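To make the two data paths concrete, here is a small sketch. The first function is the classic read/write loop described above, where every chunk passes through a userspace buffer. The excerpt stops before naming a remedy, so the second function is an assumption on our side: it uses `os.sendfile`, which asks the kernel to move the bytes itself, and falls back to the classic loop where that is not supported. The function names and the chunk size are made up for illustration.

```python
import os
import tempfile


def copy_via_userspace(src: str, dst: str, chunk_size: int = 64 * 1024) -> int:
    """Classic copy: each chunk is read into a userspace buffer, then written back."""
    copied = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)  # kernel cache -> userspace buffer
            if not chunk:
                break
            fout.write(chunk)             # userspace buffer -> kernel buffer
            copied += len(chunk)
    return copied


def copy_in_kernel(src: str, dst: str) -> int:
    """Let the kernel move the data with sendfile; fall back if it is unavailable."""
    if hasattr(os, "sendfile"):
        try:
            with open(src, "rb") as fin, open(dst, "wb") as fout:
                size = os.fstat(fin.fileno()).st_size
                offset = 0
                while offset < size:
                    sent = os.sendfile(fout.fileno(), fin.fileno(), offset, size - offset)
                    if sent == 0:
                        break
                    offset += sent
                return offset
        except OSError:
            # Some platforms only accept a socket as the destination.
            pass
    return copy_via_userspace(src, dst)


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    source = os.path.join(workdir, "source.bin")
    with open(source, "wb") as f:
        f.write(os.urandom(1024 * 1024))
    print(copy_via_userspace(source, os.path.join(workdir, "copy1.bin")))
    print(copy_in_kernel(source, os.path.join(workdir, "copy2.bin")))
```

Both functions return the number of bytes copied, so they can be swapped for each other when comparing timings on a large file.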
We have several Kafka clusters of growing importance at CocCoc. As usage scales up, we would like to get a handle on the dataflow components. We decided to enforce producer/consumer ACLs on them to ensure data provenance, using Kafka's built-in Authorizer, which stores its settings in ZooKeeper.
Coccoc uses KVM (Kernel-based Virtual Machine) virtualization infrastructure as its private cloud solution, and improving its performance and stability is always a challenging yet exciting task. In this blog post, we will go through a brief description of CPU flags and see how a small tweak to them can significantly improve the performance of a virtual machine in particular situations.
CPU flags (aka CPU features) are simply attributes of a CPU that denote which features it supports. The features can be as simple as a floating-point unit, hyperthreading technology, or the ability to extend physical memory to 64-bit addressing…
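As a quick illustration, the sketch below extracts the flag set from text in the format of `/proc/cpuinfo` on an x86 Linux host. The sample text and the function names are made up for the example. `fpu` and `ht` are the usual Linux names for the floating-point unit and hyperthreading; `pae` and `lm` are our guess at the flags the memory-addressing example alludes to.

```python
def parse_cpu_flags(cpuinfo_text: str) -> set:
    """Return the flags from the first 'flags' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def read_host_flags(path: str = "/proc/cpuinfo") -> set:
    """Read the flags of the current host (assumes an x86 Linux machine)."""
    with open(path, "r", encoding="utf-8") as f:
        return parse_cpu_flags(f.read())


# Made-up sample, trimmed to a handful of flags for readability.
SAMPLE = """processor\t: 0
model name\t: Example CPU
flags\t\t: fpu vme pae lm ht aes avx2
"""

if __name__ == "__main__":
    flags = parse_cpu_flags(SAMPLE)
    for name in ("fpu", "ht", "pae", "lm"):
        print(name, name in flags)
```

On a real host, `read_host_flags()` can replace the sample; other architectures label this line differently, so the function simply returns an empty set there.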
Notable CPU flags (Intel…
For a very long time, the infrastructure at Coccoc supported only traditional services running on physical hosts. Recently, with the rise of microservices, the trendy container orchestration technology K8s has been adopted as a big part of the grand design. Therefore, the infrastructure also needs a huge shift to support not only conventional services but containerized microservices as well. In this article we're going to go through how load balancing is implemented to publish K8s services from a very old-fashioned setup.
As depicted above, the primitive infrastructure for load-balancing architecture is straightforward:
Many of us have heard of thin/thick provisioning, well-known features that are widely advertised on the market. In this blog post we're not only going to explain what they are; we will also dig deeper into what goes on in the system behind these features, with a few simple lines of C code written by our own hands! There are even some interesting facts about the storage that we use day to day. Let's go on.
For system admins, devops, developers… And all of those who are studious, curious and want to learn more about small things working under the hood.
Imagine you want to create a Virtual Machine (VM) with a 50GB disk: the VM service (KVM / VMware…) will create a box for you with the requested specifications: CPU, memory, disk… Providing CPUs to the VM is fast, and so is memory. What about the disk? Depending on the mechanism that the service is using, it will give you either…
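The difference between the two kinds of disk can be felt with ordinary files. The sketch below is our own illustration in Python (the post itself mentions C code, which is not shown in this excerpt). It assumes a Unix-like system, where `st_blocks` is available and, on Linux, counted in 512-byte units. The function names and the 8 MiB size are arbitrary.

```python
import os
import tempfile


def create_thin(path: str, size: int) -> None:
    """Thin: only declare the size; no data blocks are written yet."""
    with open(path, "wb") as f:
        f.truncate(size)


def create_thick(path: str, size: int, chunk: int = 1024 * 1024) -> None:
    """Thick: write zeros so that the blocks are allocated up front."""
    zeros = bytes(chunk)
    with open(path, "wb") as f:
        remaining = size
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(zeros[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())


def usage(path: str) -> tuple:
    """Return (apparent size, bytes reported as allocated on disk)."""
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512  # 512-byte units on Linux


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    size = 8 * 1024 * 1024
    thin = os.path.join(workdir, "thin.img")
    thick = os.path.join(workdir, "thick.img")
    create_thin(thin, size)
    create_thick(thick, size)
    print("thin :", usage(thin))
    print("thick:", usage(thick))
```

Both files report the same apparent size, but on most filesystems the thin one occupies far less space until data is actually written into it; the exact numbers depend on the filesystem.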
Managing secrets and other sensitive entities is a troublesome yet challenging task. A tremendous number of developers out there still store plaintext passwords on collaboration platforms such as GitLab, or keep credentials inside Docker images that are accessible to anyone. At Coccoc, we use self-hosted GitLab and Docker Harbor, both open only locally, which helps mitigate leaks of secrets to the outside; it is, however, not a real solution to this kind of problem.
Because the temptation to put secrets directly into git or hardcode credentials in the source code is so huge, we procrastinated on restricting this practice at the beginning, in the fast-paced development environment. Over time, some of the tech choices we adopted, such as Ansible and Kubernetes, turned out to support a kind of secret management of their own, called Ansible Vault and Kubernetes Secrets respectively, which makes managing secrets easier. The story did not end there: another problem arose. The secrets are controlled in many places, even duplicated, and there's no unified way for services to retrieve secrets. …