The journey to becoming a developer

My future is created by what I do today, not tomorrow.


Representing Numbers and Letters with Binary: Crash Course Computer Science #4

Millie 2021. 10. 4. 14:47

In this fourth episode of Crash Course, we learn how numbers and letters can be represented in binary.

Last episode

  • how transistors can be used to build logic gates, which can evaluate boolean statements
  • in boolean algebra, there are only two binary values : true and false
  • But if we only have two values, how in the world do we represent information beyond just these two values? That's where math comes in.

Bit

  • Each of these binary digits, 1 or 0, is called a 'bit'
  • 8 bits : highest value is 255, lowest value is 0
    • 256 different values, or 2^8 (2 to the 8th power)
  • 8-bit computers, 8-bit graphics or audio : these were computers that did most of their operations in chunks of 8 bits.
  • But 256 different values isn't a lot to work with, so it meant things like 8-bit games were limited to 256 different colors for their graphics.
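
A quick Python sketch of my own (not from the video) showing where 256 comes from:

```python
# With n bits there are 2**n distinct patterns of 1s and 0s.
n_bits = 8
print(2 ** n_bits)         # 256 different values
print(2 ** n_bits - 1)     # 255, the highest value (all eight bits set)
print(format(255, "08b"))  # '11111111'
```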

Byte

  • 8 bits is such a common size in computing that it has a special word : a byte. A byte is 8 bits.
  • kilobytes, megabytes, gigabytes - these prefixes denote different scales of data
    • 1 kilobyte is a thousand bytes = 8,000 bits
    • mega is a million bytes (MB), giga is a billion bytes (GB)
    • 1 terabyte (TB) = 8 trillion ones and zeros

But that's not always true!

  • In binary, a kilobyte has two to the power of 10 bytes, or 1024.
  • 1000 is also right when talking about kilobytes, but we should acknowledge it isn't the only correct definition.
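
A small sketch of my own to make the two definitions concrete:

```python
decimal_kb = 10 ** 3  # 1000 bytes - the decimal (SI) kilobyte
binary_kb = 2 ** 10   # 1024 bytes - the binary kilobyte (formally a "kibibyte")
print(decimal_kb * 8)          # 8000 bits in a decimal kilobyte
print(binary_kb - decimal_kb)  # the two definitions disagree by 24 bytes
```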

32-bit / 64-bit computers

  • They operate in chunks of 32 or 64 bits.
  • The largest number you can represent with 32 bits is just under 4.3 billion, which is thirty-two 1's in binary.
  • This is why our Instagram photos are so smooth and pretty - they are composed of millions of colors, because computers today use 32-bit color graphics.
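
To check the 4.3 billion figure, a sketch of my own:

```python
# The largest unsigned 32-bit value: thirty-two 1's in binary.
max_u32 = 2 ** 32 - 1
print(max_u32)                  # 4294967295, just under 4.3 billion
print(bin(max_u32).count("1"))  # 32 - every bit is a 1
```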

We need a way to represent positive and negative numbers.

  • Most computers use the first bit for the sign : 1 for negative, 0 for positive numbers, and then use the remaining 31 bits for the number itself.
  • That gives us a range of roughly plus or minus two billion. While this is a pretty big range of numbers, it's not enough for many tasks.
  • The largest value a 64-bit number can represent is around 9.2 quintillion.
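
These ranges are easy to verify; a sketch of my own (note that real hardware uses two's complement rather than a plain sign bit, which is why the negative range reaches one value further):

```python
print(2 ** 31 - 1)  # 2147483647 - largest signed 32-bit value (~ +2 billion)
print(-(2 ** 31))   # -2147483648 - smallest signed 32-bit value (~ -2 billion)
print(2 ** 63 - 1)  # 9223372036854775807 - largest signed 64-bit value (~9.2 quintillion)
```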

Addresses

Computers must label locations in their memory, known as addresses, in order to store and retrieve values. As computer memory has grown to gigabytes and terabytes - that's trillions of bytes - it has become necessary to have 64-bit memory addresses as well.
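
A toy illustration of my own, treating an address as nothing more than an index into a row of byte-sized cells:

```python
# Memory behaves like a numbered row of one-byte cells.
memory = bytearray(16)  # 16 cells at addresses 0 through 15
memory[0x0A] = 255      # store a value at address 10
print(memory[0x0A])     # retrieve it later from the same address -> 255
```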

Floating point numbers - the decimal point can "float" around in the middle of the number

  • Several methods have been developed to represent floating point numbers.
  • The most common of these is the IEEE 754 standard.
    • In essence, this standard stores decimal values sort of like scientific notation.
    • For example, 625.9 can be written as 0.6259 x 10^3.
      • 0.6259 : the significand
      • 3 : the exponent


In a 32-bit floating point number

  • The first bit is used for the sign of the number - positive or negative.
  • The next 8 bits are used to store the exponent.
  • The remaining 23 bits are used to store the significand.
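
A minimal Python sketch of my own, using the standard struct module, that splits the video's example value into those three fields:

```python
import struct

# Reinterpret the 4 IEEE 754 bytes of a float as one 32-bit unsigned integer.
bits = struct.unpack(">I", struct.pack(">f", 625.9))[0]

sign = bits >> 31               # 1 bit : 0 = positive, 1 = negative
exponent = (bits >> 23) & 0xFF  # 8 bits, stored with a bias of 127
significand = bits & 0x7FFFFF   # 23 bits of fraction (with an implicit leading 1)

print(sign, exponent - 127)         # 0 9 -> 625.9 ≈ +1.00111000111... x 2^9
print(format(significand, "023b"))  # the 23 stored significand bits
```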

How do computers represent letters?

Rather than have a special form of storage for letters, computers simply use numbers to represent letters.

→ The most straightforward approach might be to simply number the letters of the alphabet.

ASCII : the American Standard Code for Information Interchange

  • Invented in 1963, ASCII was a 7-bit code, enough to store 128 different values.
  • It could encode capital letters, lowercase letters, digits 0 through 9, and symbols like the @ sign and punctuation marks.
  • ASCII even had a selection of special command codes, such as a newline character to tell the computer where to wrap a line to the next row.
  • Because ASCII was such an early standard, it became widely used, and critically, allowed different computers built by different companies to exchange data. This ability to universally exchange information is called "interoperability".
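
A quick sketch of my own using Python's built-in ord and chr to peek at the ASCII table:

```python
print(ord("A"))                 # 65 - each character is just a number
print(chr(64))                  # '@' - and each number maps back to a character
print(format(ord("A"), "07b"))  # '1000001' - fits in ASCII's 7 bits
print(ord("\n"))                # 10 - the newline command code
```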

A major limitation of ASCII : it was only designed for English

  • Computers moved from 7-bit to 8-bit codes, and the codes 128 through 255, which had previously gone unused, soon became popular for encoding "national" characters.
  • In the US, those extra numbers were used to encode additional symbols.
  • These included symbols for mathematical notation, graphical elements, and common accented characters.
  • Each country developed its own byte-encoding scheme, and none of them were compatible with one another. And so Unicode was born.

Unicode

  • Devised in 1992 to finally do away with all of the different international schemes, it replaced them with one universal encoding scheme.
  • The most common version of Unicode uses a 16-bit space with room for over a million codes - enough to store every single character from every language, and even mathematical symbols and emoji.
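
A sketch of my own showing Unicode code points; note I encode with UTF-8 here, which is more common on the web than the 16-bit scheme the video describes:

```python
# Every character gets a Unicode code point; an encoding turns it into bytes.
for ch in "A한😀":
    print(ch, hex(ord(ch)), ch.encode("utf-8"))
# A  0x41    b'A'                - ASCII is unchanged (1 byte)
# 한 0xd55c  b'\xed\x95\x9c'     - a Korean syllable (3 bytes)
# 😀 0x1f600 b'\xf0\x9f\x98\x80' - an emoji (4 bytes)
```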

And in the same way that ASCII defines a scheme for encoding letters as binary numbers, other file formats - like MP3s or GIFs - use binary numbers to encode sounds or the colors of pixels in our photos, movies, and music.

Most importantly, under the hood it all comes down to long sequences of bits. Text messages, YouTube videos, every webpage on the internet, and even your computer's OS are nothing but long sequences of 1s and 0s.

Next episode : how computers start manipulating those binary sequences, for our first true taste of computation.

Crash Course really does compress a vast amount of knowledge about computers into about ten minutes. It picks out and summarizes the most important parts of all that knowledge, so to learn anything in more detail you have to consult other materials.

Because of errors and awkward phrasing in the Korean subtitles, I wrote these notes almost entirely in English.

 

https://youtu.be/1GSjbWt0c9M