"Redis Development and Operations" notes

I. Redis features
1. Why Redis: (1) fast: written in C, keeps data in memory, single-threaded architecture; (2) rich features: key expiration, publish/subscribe, Lua scripting for building new commands, simple transactions, pipelining; (3) persistence: RDB/AOF; (4) high availability: master-slave replication / Sentinel / Cluster.
2. Each external data type is implemented by multiple internal encodings, so an encoding can be improved without any externally visible change, and different encodings play to their strengths in different scenarios.
3. Single-threading is viable because access is pure in-memory and I/O is non-blocking, and it avoids the cost of thread switching and lock contention; Redis is therefore geared toward fast-executing commands.

II. API understanding and use
1. Data types
(1) String. Internal encodings: int (8-byte long integer), embstr (string of 39 bytes or less), raw (more than 39 bytes). Scenarios: caching, counting, shared sessions, rate limiting via expiration times.
(2) Hash. Internal encodings: ziplist (used when the field count and every value length are below the configured thresholds; saves memory but reads and writes are slower), hashtable (faster reads and writes, more memory). Scenario: storing objects. Three approaches compared: one key per field is simple and intuitive and each field can be updated directly, but it uses too many keys, consumes a lot of memory, and scatters the object's information; a serialized string is simple to program and memory-efficient, but serialization has overhead and a single field cannot be updated directly; a hash, used properly, reduces memory usage, but its internal encoding must be kept under control or it consumes more memory instead.
(3) List. Internal encodings: ziplist, linkedlist. Scenarios: message queues, list data.
(4) Set. Internal encodings: intset (all elements are integers and the count is below the configured threshold), hashtable. Scenarios: set operations, random sampling.
(5) Sorted set. Internal encodings: ziplist, skiplist. Scenario: leaderboards.
2. Key management
(1) Traversing keys. KEYS is likely to block Redis when it holds a large number of keys, so it should generally be run offline or on a slave node. SCAN traverses progressively: each call returns a cursor (the starting point for the next traversal) along with the keys found so far. It avoids the blocking problem, but if keys change during the traversal, a correct result (no misses or duplicates) is not guaranteed.
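One of the string scenarios above — rate limiting via expiration times — can be sketched without a live server. This is a minimal in-process model (the `FixedWindowLimiter` class and the dict standing in for Redis keys are assumptions of this sketch); the comments show the equivalent INCR/EXPIRE pattern.

```python
import time

class FixedWindowLimiter:
    """Model of the Redis string rate-limit pattern:
        INCR rate:{user}      -- count requests in the current window
        EXPIRE rate:{user} 60 -- window resets when the key expires
    A dict of [count, expires_at] stands in for Redis keys."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.store = {}  # key -> [count, expires_at]

    def allow(self, key, now=None):
        now = time.time() if now is None else now
        entry = self.store.get(key)
        if entry is None or entry[1] <= now:          # key missing or "expired"
            self.store[key] = [1, now + self.window]  # INCR + EXPIRE
            return True
        entry[0] += 1                                 # INCR
        return entry[0] <= self.limit

limiter = FixedWindowLimiter(limit=3, window_seconds=60)
results = [limiter.allow("user:42", now=0) for _ in range(5)]
# the first 3 requests pass, the rest are rejected until the window expires
```

In real Redis the INCR and EXPIRE should be combined atomically (e.g., with a Lua script or SET with NX/EX), otherwise a crash between the two commands leaves a counter that never expires.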
3. Other features
(1) Slow query log: records only command execution time, not network transmission or queuing time. The log lives in memory, so if slow queries are frequent it should be persisted periodically.
(2) Simple transactions: rollback is not supported.
(3) Pipelining executes commands in batches, effectively saving RTT (round-trip time), but it is not atomic.
(4) Lua scripts build atomic, efficient, custom command combinations.
(5) Simple publish/subscribe.
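The RTT saving from pipelining can be made concrete with a little arithmetic (the numbers below are illustrative assumptions, not measurements): without a pipeline every command pays one round trip; with a pipeline all commands share one.

```python
def total_time_ms(n_commands, rtt_ms, exec_ms, pipelined):
    """Rough cost model for a batch of commands.
    pipelined: all commands travel together, so one RTT total.
    non-pipelined: each command pays its own RTT."""
    if pipelined:
        return rtt_ms + n_commands * exec_ms
    return n_commands * (rtt_ms + exec_ms)

# 1000 commands, 1 ms round trip, 1 ms server-side execution each
naive = total_time_ms(1000, 1, 1, pipelined=False)  # 2000 ms
piped = total_time_ms(1000, 1, 1, pipelined=True)   # 1001 ms
```

The model also shows why pipelining helps most when RTT dominates execution time, and why a pipeline is still non-atomic: other clients' commands can interleave between the batched ones on the server.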

III. Persistence
1. RDB
Description: an RDB file is a compact, compressed binary snapshot of the current process's data saved to disk. Manual triggers: SAVE blocks the server and is deprecated; BGSAVE forks a child process to generate the RDB file. Automatic triggers: the `save` configuration, a full copy to a slave node, the `debug reload` command, and `shutdown` when AOF is not enabled. Main-thread blocking: the fork's blocking time depends on the Redis process's memory size and the system. Pros and cons: a full copy that recovers quickly, but it cannot run in real time, and RDB file versions may have compatibility problems.
2. AOF
Description: AOF appends every write command to the aof_buf log, synchronizes it to disk according to the configured policy, and periodically rewrites the log to compact it. Rewrite triggers: the BGREWRITEAOF command, or the configured thresholds. During a rewrite the original AOF flow is unchanged; new writes are additionally saved in aof_rewrite_buf, a child process writes a new AOF file from a memory snapshot, and finally the aof_rewrite_buf data is appended to the new file, which replaces the old one. Main-thread blocking: AOF append blocking indicates the disk is under pressure. Pros and cons: near-real-time durability, but recovery is relatively slow; a corrupted AOF file can be repaired from a backup with `redis-check-aof --fix`, then `diff -u` to find the lost data and complete it by hand.
3. Other
When multiple instances are deployed on one machine, isolate and schedule their rewrites so that multiple child processes do not compete for CPU and IO resources.
IV. Blocking
1. Internal causes
(1) Unreasonable use of an API or data structure, such as `keys *` or querying large objects.
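The core idea of AOF rewrite — replace the full command history with the minimal commands that reproduce the current state — can be modeled in a few lines. This is a pure-Python sketch: the tiny tuple command format and the `rewrite_aof` helper are assumptions, not Redis's actual file format.

```python
def apply(state, command):
    """Apply a tiny SET/DEL/INCR command tuple to an in-memory state dict."""
    op, key = command[0], command[1]
    if op == "SET":
        state[key] = command[2]
    elif op == "DEL":
        state.pop(key, None)
    elif op == "INCR":
        state[key] = state.get(key, 0) + 1
    return state

def rewrite_aof(log):
    """Replay the whole log, then emit one SET per surviving key --
    the compacted log reproduces the same final state in fewer commands."""
    state = {}
    for cmd in log:
        apply(state, cmd)
    return [("SET", k, v) for k, v in state.items()]

log = [("SET", "a", 1), ("INCR", "a"), ("SET", "b", 9), ("DEL", "b"),
       ("INCR", "c"), ("INCR", "c")]
compact = rewrite_aof(log)  # two SETs instead of six commands
```

Real Redis rewrites from a memory snapshot in a forked child for the same reason this sketch replays into `state` first: only the final values matter, so deleted and overwritten history can be dropped.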
(2) CPU saturation: the single core Redis runs on is near 100% usage. First determine whether throughput has reached its limit; if it has not, the saturation is abnormal — likely a high-complexity command, or over-aggressive memory optimization such as relaxing the ziplist thresholds too far, since ziplist operations range from O(n) to O(n^2).
(3) Persistence-related blocking: fork blocking, AOF fsync blocking, and HugePage copy-on-write blocking.
2. External causes
(1) CPU competition: Redis is a typical CPU-intensive application; do not co-deploy it with other CPU-intensive services, and consider binding the Redis process to a CPU.
(2) Memory swapping: the system swaps part of Redis's memory to disk, causing a sharp drop in performance. Ensure sufficient memory, set Redis's maximum available memory, and lower the system's swap priority.
(3) Network problems.
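The ziplist-versus-hashtable complexity point can be illustrated with plain Python stand-ins (the helper names are assumptions): a list of pairs scanned linearly models a ziplist's sequential entries, while a dict models a hashtable's O(1) average lookup.

```python
def ziplist_get(pairs, field):
    """ziplist-style lookup: walk every entry -- O(n) per access,
    which is why oversized ziplists burn CPU on every operation."""
    for f, v in pairs:
        if f == field:
            return v
    return None

def hashtable_get(table, field):
    """hashtable-style lookup: O(1) on average, at the cost of more memory."""
    return table.get(field)

pairs = [(f"field{i}", i) for i in range(1000)]
table = dict(pairs)
# same answer, very different amount of work per call
assert ziplist_get(pairs, "field999") == hashtable_get(table, "field999") == 999
```

This is the trade-off behind the ziplist thresholds: below the configured size the linear scan is cheap and the memory saving wins; far above it, every access pays the O(n) walk.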

V. Understanding memory
1. Memory consumption
(1) Object memory: the largest block of Redis memory, storing all user data.
(2) Buffer memory: client buffers (TCP connections), the replication backlog buffer (on the master), and the AOF buffer.
(3) Memory fragmentation: frequent updates and mass deletion of expired keys tend to raise the fragmentation ratio. Prefer numeric types or fixed-length strings to reduce fragmentation; a safe restart of the node re-compacts it.
(4) Child process memory consumption.
2. Managing memory
(1) Set an upper limit on Redis memory usage.
(2) Memory recovery strategies: expired-key deletion (lazy deletion plus periodic deletion), and memory-overflow control policies — by default reject writes (noeviction); LRU over keys with an expiration set (volatile-lru); randomly delete any key (allkeys-random); randomly delete keys with an expiration set (volatile-random); delete the keys closest to expiring by TTL (volatile-ttl).
3. Memory optimization
(1) Reduce key and value object sizes.
(2) Shared object pool: Redis internally maintains a pool of the integer objects [0, 9999], so prefer integer values for data.
(3) String optimization: Redis's own string implementation has a pre-allocation mechanism, so minimize frequent string modification commands (append, setrange) and use set directly, reducing the memory waste and fragmentation caused by pre-allocation.
(4) Encoding optimization: control the internal encoding via the corresponding configuration.
(5) Control the number of keys.
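The shared-integer idea in (2) above can be sketched in a few lines (the `IntPool` class is an assumption of this sketch, not Redis code): values inside the pooled range are served by reference, so a million keys holding the value 42 share one object instead of a million allocations.

```python
class IntPool:
    """Sketch of Redis's shared-integer pool: one object per small
    integer ([0, 9999] in Redis), reused by reference instead of
    allocating a fresh value object for every key."""

    def __init__(self, upper=10000):
        self.pool = list(range(upper))  # pre-built objects, one per value

    def get(self, n):
        if 0 <= n < len(self.pool):
            return self.pool[n]  # shared object, no new allocation
        return int(n)            # outside the pool: allocate normally

pool = IntPool()
a = pool.get(42)
b = pool.get(42)
assert a is b  # both references point at the same pooled object
```

This is also why the notes say to "prefer integers": only integer values can hit the pool, so encoding counters and flags as numbers saves per-key object overhead.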

VI. Cache design
1. Weigh the benefits and costs of caching
(1) Benefits: faster reads and writes, lower back-end load.
(2) Costs: data inconsistency, code maintenance cost, Redis operations cost.
2. Cache update strategies
(1) Strategies: LRU/LFU/FIFO eviction, timeout-based expiry, and active updates.
(2) Recommendation: low-consistency services can rely on a maximum-memory limit plus an eviction policy; high-consistency services should combine timeout-based expiry with active updates.
3. Cache granularity control.
4. Penetration optimization
(1) Cache empty objects, so that queries for data that does not exist in the storage layer stop increasing back-end load; give the empty objects an expiration time so they do not occupy memory indefinitely.
(2) Bloom filter interception.
5. "Bottomless pit" optimization
(1) The bottomless pit: more nodes do not mean higher performance, because one batch operation incurs more network overhead, and the number of network connections affects node performance.
(2) Optimized batch operation modes: serial commands, serial IO, parallel IO, hash_tag.
6. The avalanche problem
(1) Keep the cache layer highly available.
(2) Degrade the service (client-side fallback).
7. The hot key problem
(1) A hot key sees heavy concurrency, and once it expires the cache cannot be rebuilt quickly, so many threads may rebuild it at once and spike the back-end load.
(2) A mutex (allowing only one thread to rebuild the cache) or "never expire" can solve the problem to some extent; developers need to understand the cost of each approach.
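The empty-object penetration guard in 4.(1) can be sketched in process (the `NullCachingStore` class, the sentinel, and the dict standing in for the database are assumptions of this sketch): a backend miss is cached as a short-lived sentinel, so repeated lookups for nonexistent keys stop hitting the backend.

```python
import time

_MISSING = object()  # sentinel cached for keys absent from the backend

class NullCachingStore:
    """Penetration guard: cache backend misses as a short-lived empty
    object so repeated lookups for nonexistent keys skip the backend."""

    def __init__(self, backend, null_ttl=60):
        self.backend = backend   # dict standing in for the database
        self.null_ttl = null_ttl
        self.cache = {}          # key -> (value_or_sentinel, expires_at)
        self.backend_calls = 0

    def get(self, key, now=None):
        now = time.time() if now is None else now
        hit = self.cache.get(key)
        if hit is not None and hit[1] > now:         # fresh cache entry
            return None if hit[0] is _MISSING else hit[0]
        self.backend_calls += 1                      # cache miss: ask backend
        value = self.backend.get(key, _MISSING)
        self.cache[key] = (value, now + self.null_ttl)  # cache hits AND misses
        return None if value is _MISSING else value

store = NullCachingStore({"k1": "v1"})
store.get("ghost", now=0)  # backend queried once
store.get("ghost", now=1)  # served from the cached empty object
assert store.backend_calls == 1
```

The TTL on the sentinel is the memory/consistency trade-off from the notes: a longer TTL shields the backend better but delays visibility if the key is later created.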

VII. Linux configuration optimization
1. Memory allocation control: set vm.overcommit_memory=1, meaning the kernel allows memory to be overcommitted until it is exhausted.
2. Swap control: set vm.swappiness (and watch swap usage in real time) to keep the system from swapping out part of Redis's memory, or killing Redis outright, when memory runs short.
3. Disable Transparent HugePage. Otherwise, after a fork, each memory page grows from 4KB to 2MB, increasing rewrite memory consumption; meanwhile the copy-on-write page triggered by each write command is 512 times larger, slowing writes down.
4. Set ulimit -n, the maximum number of files the current user can open simultaneously (Redis connections are file handles), to raise Redis's maximum connection count.
5. Set the OS connection backlog (net.core.somaxconn) to be at least as large as Redis's tcp-backlog.
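The settings above correspond roughly to the following commands (a sketch: the values are common starting points, not mandates; paths can differ by distribution, and the sysctl and THP changes need root):

```shell
# 1. Allow memory overcommit so fork/bgsave succeeds under memory pressure
sysctl -w vm.overcommit_memory=1

# 2. Keep Redis pages in RAM; swap only as a last resort
sysctl -w vm.swappiness=1

# 3. Disable Transparent HugePage (avoids 2MB copy-on-write pages after fork)
echo never > /sys/kernel/mm/transparent_hugepage/enabled

# 4. Raise the open-file limit for this shell (each connection is a file handle)
ulimit -n 65535

# 5. OS accept queue should be >= Redis's tcp-backlog (Redis default is 511)
sysctl -w net.core.somaxconn=512
```

To survive reboots, the sysctl values belong in /etc/sysctl.conf and the file-descriptor limit in /etc/security/limits.conf; the THP echo must be reapplied at boot (e.g., via rc.local or a systemd unit).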