Posts

Showing posts from May, 2023

something important about system design

Whenever we design a large system, we need to consider a few things: What are the different architectural pieces that can be used? How do these pieces work with each other? How can we best utilize these pieces, and what are the right tradeoffs? In this article, we’ll go through Load Balancing, Caching, Data Partitioning, Indexes, Replication and Consistent Hashing. These are the basics that you need to know about system design.

Key Characteristics of Distributed Systems

Before diving into a specific section, we’ll discuss the key characteristics of a distributed system, which include Scalability, Reliability, Availability and Manageability.

Scalability is the capability of a system, process, or network to grow and manage increased demand. Any distributed system that can continuously evolve to support a growing amount of work is considered scalable. Horizontal scaling means that you scale by adding more servers, while vertical scaling means that you scale by replacing the exist...

System Design vol.1

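I have recently been reading "Grokking the System Design Interview", a tutorial written specifically for system design interviews. The book analyzes and answers common interview scenarios using a fixed template, and it also covers the key concepts behind common back-end architectures.

1. Step 1: Requirements clarifications

Because these are open-ended questions with no single correct answer, it is important to establish the scope of the problem and resolve any ambiguous requirements early in the interview. In addition, we only have 35-40 minutes to design a large system, so we need to work out which parts of the overall design to focus on. Take designing a Twitter-like service as an example. Before moving on to the next stage of the interview, you need to settle the following questions:

1. Can users post tweets and follow other users?
2. Do we need to create and display the user's timeline?
3. Do tweets need to include videos and photos?
4. Are we focusing only on the back end, or do we also need to build the front end?
5. Can users search tweets?
6. Do we need to display hot trending topics?
7. Do we need to push notifications for popular or important tweets?

2. Step 2: System interface definition

Design the APIs the system needs, to confirm that our understanding of the requirements is correct. For example:

postTweet(user_id, tweet_data, tweet_location, user_location, timestamp, …)
generateTimeline(user_id, current_time, user_location, …)
markTweetFavorite(user_id, tweet_id, timestamp, …)

3. Step 3: Back-of-the-envelope estimation

To support later scaling, data partitioning, load balancing, and caching, we need to estimate several aspects of the system:

System scale (number of users, total number of tweets, number of views, and so on)
How much storage is needed (whether photos and videos are included can change the storage requirement dramatically)
Traffic (QPS and similar figures determine how much bandwidth you need, the pressure on back-end storage, how to load balance, and so on)

4. Step...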
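As a rough illustration of the Step 3 estimation described above, here is a minimal Python sketch. The `Workload` fields, the `estimate` helper, and every number in the usage example are hypothetical assumptions chosen for illustration; none of them come from the book.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 24 * 60 * 60


@dataclass
class Workload:
    """Hypothetical inputs for a Twitter-like service (all assumed values)."""
    daily_active_users: int
    tweets_per_user_per_day: float
    timeline_views_per_user_per_day: float
    avg_tweet_bytes: int
    media_fraction: float      # share of tweets that carry a photo or video
    avg_media_bytes: int


def estimate(w: Workload) -> dict:
    """Return average write/read QPS and daily storage growth in GB."""
    new_tweets = w.daily_active_users * w.tweets_per_user_per_day
    views = w.daily_active_users * w.timeline_views_per_user_per_day
    text_bytes = new_tweets * w.avg_tweet_bytes
    media_bytes = new_tweets * w.media_fraction * w.avg_media_bytes
    return {
        "write_qps": new_tweets / SECONDS_PER_DAY,
        "read_qps": views / SECONDS_PER_DAY,
        "text_storage_gb_per_day": text_bytes / 1e9,
        "media_storage_gb_per_day": media_bytes / 1e9,
    }


if __name__ == "__main__":
    # Assumed figures only; replace them with numbers agreed on in the interview.
    example = Workload(
        daily_active_users=100_000_000,
        tweets_per_user_per_day=0.5,
        timeline_views_per_user_per_day=20,
        avg_tweet_bytes=300,
        media_fraction=0.2,
        avg_media_bytes=200_000,
    )
    for name, value in estimate(example).items():
        print(f"{name}: {value:,.1f}")
```

With these assumed inputs, media storage dominates text storage by more than two orders of magnitude, which matches the point above that including photos and videos changes the storage requirement dramatically. The figures are daily averages; peak traffic would need a separate multiplier.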