Java concurrency

The Fork/Join model in multi-threaded programming:

(1) Initial setup: The Main Thread
(2) Fork: Spawn new subtasks
(3) Parallel execution
(4) Join: consolidate results
(5) Repeat
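
As a rough illustration of the steps above, here is a minimal sketch using the JDK's `ForkJoinPool` and `RecursiveTask`. The `SumTask` class, its `THRESHOLD` cutoff, and the array-summing workload are assumptions made up for this example, not something the post prescribes.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical task: sums a slice of an array by recursively splitting it in half.
class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // assumed cutoff for "small enough"
    private final long[] data;
    private final int from;
    private final int to;

    SumTask(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            // Small slice: compute directly instead of forking further.
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = from + (to - from) / 2;
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();                      // (2) fork: schedule the left half as a subtask
        long rightSum = right.compute();  // (3) run the right half on the current thread
        return left.join() + rightSum;    // (4) join: wait for the left half and combine
    }

    // (1) initial setup: the calling thread submits the root task to the pool.
    static long sum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }
}
```

Step (5) shows up as the recursion: each subtask forks and joins again until its slice falls below the threshold.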

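Critical section: code that accesses shared resources or variables, so only one thread should execute it at a time.
Race condition: multiple threads access shared data at the same time, and the result depends on the order in which they run.
Synchronization tools: coordinate threads around a critical section. They include mutexes, read/write locks, semaphores, condition variables, and barriers.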
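
To make the terms above concrete, the following sketch guards a critical section with a mutex, using `java.util.concurrent.locks.ReentrantLock`. The `SafeCounter` class and its `runDemo` helper are hypothetical names for this example. Without the lock, the `count++` calls from different threads could race and lose updates.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical counter whose increment is a critical section guarded by a mutex.
class SafeCounter {
    private final ReentrantLock lock = new ReentrantLock();
    private long count = 0;

    void increment() {
        lock.lock();
        try {
            count++; // critical section: read-modify-write on shared state
        } finally {
            lock.unlock(); // always release, even if the body throws
        }
    }

    long get() {
        lock.lock();
        try {
            return count;
        } finally {
            lock.unlock();
        }
    }

    // Starts several threads that all increment one shared counter, then waits for them.
    static long runDemo(int threads, int incrementsPerThread) {
        SafeCounter counter = new SafeCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counter.get();
    }
}
```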


High level components

This post covers several aspects: index, replication, failure detection, and consistency. The last part looks at some existing databases as examples.

Index

A database index speeds up reads on a specific key. The trade-off is that an index slows down writes, because each write must also update the index.

Hash index

A hash index is an in-memory hash table that maps each key to the memory location of its data, and it is occasionally written to disk for persistence. However, a hash table works poorly when it has to live on disk.

Pros: easy to implement and very fast (RAM is fast).
Cons: all keys must fit in memory, and it is bad for range queries.

Scenarios: fast, but only useful on small datasets.
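
A minimal sketch of the idea, under several assumptions: the `HashIndexStore` name is hypothetical, a `ByteArrayOutputStream` stands in for the data file, and records use a made-up format of a 4-byte length followed by the value bytes. The hash table maps each key to the offset of its latest record.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical key-value store: append-only log plus an in-memory hash index.
class HashIndexStore {
    // Stand-in for an append-only data file (assumption for this sketch).
    private final ByteArrayOutputStream log = new ByteArrayOutputStream();
    // Hash index: key -> offset of the most recent record for that key.
    private final Map<String, Integer> index = new HashMap<>();

    void put(String key, String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        int offset = log.size(); // the new record starts at the current end of the log
        byte[] header = ByteBuffer.allocate(4).putInt(bytes.length).array();
        log.write(header, 0, header.length);
        log.write(bytes, 0, bytes.length);
        index.put(key, offset); // older records for this key become unreachable
    }

    String get(String key) {
        Integer offset = index.get(key);
        if (offset == null) {
            return null; // key was never written
        }
        byte[] all = log.toByteArray(); // copies the whole log; acceptable only in a sketch
        int length = ByteBuffer.wrap(all, offset, 4).getInt();
        return new String(all, offset + 4, length, StandardCharsets.UTF_8);
    }
}
```

A point lookup is one hash probe plus one read. A range query would have to visit every key, because a `HashMap` keeps no ordering, which matches the cons listed above.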


Block, File and Object storage

Block storage: raw blocks attached to a server as a volume. It is mutable, with higher cost and higher performance but lower scalability, because a volume can only be attached to one server. Good for VMs and databases.
File storage: built on top of block storage as a higher level of abstraction that handles files and directories. Medium to high performance and cost, medium scalability. It provides general-purpose file system access and is good for sharing files and folders within an organization.
Object storage: sacrifices performance for higher durability and vast scalability at low cost. Objects are generally immutable, although versioning is supported. It targets relatively cold data, and access is through RESTful APIs.

Requirement for object storage

This post is mostly about object storage. It provides RESTful APIs, including PUT and GET object.
Business entities: bucket (similar to a folder) and object.
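
To illustrate the two entities and the PUT/GET operations, here is a small in-memory sketch. `InMemoryObjectStore` and its method names are hypothetical and do not correspond to any real service's API. A real object store would expose these operations over HTTP and persist data durably. Each PUT appends a new version rather than modifying existing bytes, as one possible reading of "immutable with versioning".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of buckets and versioned objects.
class InMemoryObjectStore {
    // bucket name -> object key -> versions of that object, oldest first
    private final Map<String, Map<String, List<byte[]>>> buckets = new HashMap<>();

    void createBucket(String bucket) {
        buckets.putIfAbsent(bucket, new HashMap<>());
    }

    // Models a PUT of an object: stores a new version and returns its 1-based version number.
    int putObject(String bucket, String key, byte[] data) {
        Map<String, List<byte[]>> objects = buckets.get(bucket);
        if (objects == null) {
            throw new IllegalArgumentException("no such bucket: " + bucket);
        }
        List<byte[]> versions = objects.computeIfAbsent(key, k -> new ArrayList<>());
        versions.add(data.clone()); // defensive copy so stored bytes never change
        return versions.size();
    }

    // Models a GET of an object: returns the latest version, or null if absent.
    byte[] getObject(String bucket, String key) {
        Map<String, List<byte[]>> objects = buckets.get(bucket);
        if (objects == null) {
            return null;
        }
        List<byte[]> versions = objects.get(key);
        if (versions == null) {
            return null;
        }
        return versions.get(versions.size() - 1).clone();
    }
}
```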


Why sampling?

Consider maintaining a highly concurrent service with 3k-5k requests per second hitting one server. This generates a large number of request logs. Among those logs, data plane APIs (data related) normally have a much larger volume than control plane APIs (management related).

We want to understand how healthy the service is and how healthy each API is. Note that a lower volume does not make an API less important.

How do we do that? When the request log data is large, it is often advantageous to choose a smaller subset that summarizes the original dataset. This is called sampling. The main idea is to take a statistically significant sample of the data and analyse that sample rather than the whole original dataset.

By querying the sampled data, the system can efficiently provide a result that approximates the real answer.
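
The excerpt does not say which sampling scheme is used. As one possible approach, the sketch below keeps a separate fixed-size reservoir per API (reservoir sampling, Algorithm R), so a low-volume control plane API is not drowned out by high-volume data plane traffic. `PerApiSampler`, its capacity, and the API names in the usage note are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sampler: one uniform reservoir of at most `capacity` log lines per API.
class PerApiSampler {
    private final int capacity;
    private final Random random;
    private final Map<String, List<String>> samples = new HashMap<>();
    private final Map<String, Long> seen = new HashMap<>();

    PerApiSampler(int capacity, long seed) {
        this.capacity = capacity;
        this.random = new Random(seed); // seeded so runs are reproducible
    }

    void offer(String api, String logLine) {
        long n = seen.merge(api, 1L, Long::sum); // number of logs seen so far for this API
        List<String> reservoir = samples.computeIfAbsent(api, k -> new ArrayList<>());
        if (reservoir.size() < capacity) {
            reservoir.add(logLine); // still filling: keep everything
            return;
        }
        // Algorithm R: pick a slot in [0, n); the new line is kept with probability capacity / n.
        long slot = (long) (random.nextDouble() * n);
        if (slot < capacity) {
            reservoir.set((int) slot, logLine);
        }
    }

    List<String> sample(String api) {
        return samples.getOrDefault(api, Collections.emptyList());
    }

    long seenCount(String api) {
        return seen.getOrDefault(api, 0L);
    }
}
```

For example, after offering 10,000 `GetObject` logs and 5 `CreateBucket` logs with a capacity of 100, the sampler holds 100 lines for the first API and all 5 for the second. Counts estimated from a sample can be scaled back up using `seenCount`.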


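Among Japanese women, the most common sites of newly diagnosed cancer are, in order, the breast, the colon and rectum, and the lung. Frequently eating very greasy food that contains many additives is bad for health and can trigger colorectal cancer. Eating more foods that help beneficial gut bacteria grow, such as fiber-rich vegetables and fruit and fermented foods, supports colon health.

How does cancer form?

1. Cancer cell formation: initiators. Human cells are renewed every day, and under the influence of carcinogens, viral infection, aging and so on, cell replication becomes more error-prone. Initiators include reactive oxygen species, chemicals, and ultraviolet light.
Prevention: avoid tobacco, radiation, ultraviolet light and so on, and take antioxidants.
2. Promotion: promoters turn mutated cells into cancer cells. They include viruses, fat, and salt.
3. Cancer cell proliferation: NK cells (natural killer cells) patrol the body and attack cancer cells.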


A cache helps availability and resiliency: for example, by improving request latency it lets the service handle more incoming traffic, and it decreases the load on downstream dependencies.

On the flip side, a cache introduces modal behavior for your service, with differing behavior depending on whether a given object is cached.
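
The two modes can be seen in a minimal read-through cache sketch. `ReadThroughCache` is a hypothetical class, and its `loader` function stands in for a call to a downstream dependency. The sketch is not thread-safe and never evicts, which a production cache would need to address.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical read-through cache that counts how often each mode is taken.
class ReadThroughCache<K, V> {
    private final Map<K, V> entries = new HashMap<>();
    private final Function<K, V> loader; // stand-in for the downstream dependency
    private int hits = 0;
    private int misses = 0;

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        V cached = entries.get(key);
        if (cached != null) {
            hits++; // fast mode: served locally, no downstream call
            return cached;
        }
        misses++; // slow mode: latency and load now depend on the downstream
        V loaded = loader.apply(key);
        if (loaded != null) {
            entries.put(key, loaded);
        }
        return loaded;
    }

    int hits() {
        return hits;
    }

    int misses() {
        return misses;
    }
}
```

Tracking hits and misses separately makes the modal behavior visible: if the cache is cold or flushed, every request takes the miss path and the downstream sees the full traffic.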
