02python Chinese garbled

  • 言之理理, but everything is a process of evolution; Moreover, "big and full" is not necessarily good; tractors pull a lot of goods; can you drive to work? Take a closer look at the development of "character encoding" The computer was first invented in the United States. English mainly includes 26 letters (case), 10 numbers, punctuation, control characters, etc. Therefore, the ASCII character encoding is finally formulated, which maps the relationship between characters and character codes. And use the last seven digits of a byte (0 - 127) to store; (At the time, TM did not expect, the computer would be popular) Slowly popularize the computer to other parts of Western Europe, and found that many characters could not be identified; therefore, the ASCII was extended, called EASCII encoding; or one byte, from 128 – 255; but for the expansion of this block, each manufacturer has Their own standards (such as the relatively famous CP437 at the time); finally led to no communication with each other; therefore, later, the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) jointly developed a standard ISO/8859-1 ( Latin-1), inherits 128-159 of CP437; redefines 160-255; Then after going to China, all of them were forced; the Chinese characters were profound and profound, and one byte was definitely not enough; so, the Chinese made a GB2312 to store Chinese, 6763 Chinese characters; (double-byte, ASCII-compatible) However, it was very cool at the beginning. Later, I found that there are also traditional characters, Tibetan, Mongolian, Uighur... I forced X2; So I made a heart, and made a GBK, all of them to get in; China is done, then Japan, South Korea... If at that time, each has its own character code, how should it communicate? For example, 666, in China, NB; in the island country, SB, it is a mess; therefore, the United Alliance International Organization, proposed Unicode encoding; covers all the text in the world, each character has a corresponding unique character code, this time everyone is happy, but for each character code, use a few bytes of storage problem There are several different specific solutions; for example, utf-8, utf-16, utf-32... So, in fact, when we talk about encoding here, it means Unicode encoding.