Published September 03, 2018

Exercise 23 - Strings, Bytes, and Character Encodings



First, download this file and save it in the folder where you will keep the script:
https://learnpythonthehardway.org/python3/languages.txt

Type the following code in Notepad++
and save it as ex23.py. I have added comments to make it easier to read.
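# Import the sys module and unpack the three command-line values:
# the script name, the encoding, and the error handler.
import sys
script, encoding, error = sys.argv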

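# Define main, which reads one line from the file per call.
def main(language_file, encoding, errors):
    line = language_file.readline()

    # readline() returns an empty string at end of file, so the
    # recursion stops there; otherwise keep going.
    if line:
        # Call print_line on the line just read.
        print_line(line, encoding, errors)
        # Call main again to process the next line.
        return main(language_file, encoding, errors)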

# Define print_line.
def print_line(line, encoding, errors):
    # strip() removes the trailing \n (and any surrounding whitespace).
    next_lang = line.strip()

    # encode() turns the string into bytes.
    raw_bytes = next_lang.encode(encoding, errors=errors)

    # decode() turns the bytes back into a string.
    cooked_string = raw_bytes.decode(encoding, errors=errors)

    print(raw_bytes, "<===>", cooked_string)

# Open the languages.txt file.
languages = open("languages.txt", encoding="utf-8")

main(languages, encoding, error)
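
The recursive main works for a short file like languages.txt, but every line adds another call on the stack. As a minimal sketch of the same logic written as a loop, the version below iterates over the file object directly. The names round_trip and process_lines are hypothetical (they are not part of the exercise), and io.StringIO stands in for the opened file so the snippet runs on its own.

```python
import io

def round_trip(line, encoding, errors):
    """Encode a stripped line to bytes, then decode those bytes back to a str."""
    next_lang = line.strip()
    raw_bytes = next_lang.encode(encoding, errors=errors)
    cooked_string = raw_bytes.decode(encoding, errors=errors)
    return raw_bytes, cooked_string

def process_lines(language_file, encoding, errors):
    """Loop over the file object instead of recursing once per line."""
    results = []
    for line in language_file:
        results.append(round_trip(line, encoding, errors))
    return results

# io.StringIO is a stand-in for open("languages.txt", encoding="utf-8").
sample = io.StringIO("Afrikaans\nAragon\u00e9s\nCatal\u00e0\n")
for raw_bytes, cooked_string in process_lines(sample, "utf-8", "strict"):
    print(raw_bytes, "<===>", cooked_string)
```

With a real file, the same loop would sit inside a with open(...) block so the file is closed when the loop finishes.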

Then open the Windows cmd prompt and run the script with python.
C:\Windows\System32>cd C:\Users\Peter\Desktop\Python\LP3THW
C:\Users\Peter\Desktop\Python\LP3THW>python .\ex23.py utf-8 strict
b'Afrikaans' <===> Afrikaans
b'\xe1\x8a\xa0\xe1\x88\x9b\xe1\x88\xad\xe1\x8a\x9b' <===> አማርኛ
b'\xd0\x90\xd2\xa7\xd1\x81\xd1\x88\xd3\x99\xd0\xb0' <===> Аҧсшәа
b'\xd8\xa7\xd9\x84\xd8\xb9\xd8\xb1\xd8\xa8\xd9\x8a\xd8\xa9' <===> العربية
b'Aragon\xc3\xa9s' <===> Aragonés
b'Arpetan' <===> Arpetan
b'Az\xc9\x99rbaycanca' <===> Azərbaycanca
b'Bamanankan' <===> Bamanankan
b'\xe0\xa6\xac\xe0\xa6\xbe\xe0\xa6\x82\xe0\xa6\xb2\xe0\xa6\xbe' <===> বাংলা
b'B\xc3\xa2n-l\xc3\xa2m-g\xc3\xba' <===> Bân-lâm-gú
b'\xd0\x91\xd0\xb5\xd0\xbb\xd0\xb0\xd1\x80\xd1\x83\xd1\x81\xd0\xba\xd0\xb0\xd1\x8f' <===> Беларуская
b'\xd0\x91\xd1\x8a\xd0\xbb\xd0\xb3\xd0\xb0\xd1\x80\xd1\x81\xd0\xba\xd0\xb8' <===> Български
b'Boarisch' <===> Boarisch
b'Bosanski' <===> Bosanski
b'\xd0\x91\xd1\x83\xd1\x80\xd1\x8f\xd0\xb0\xd0\xb4' <===> Буряад
b'Catal\xc3\xa0' <===> Català
b'\xd0\xa7\xd3\x91\xd0\xb2\xd0\xb0\xd1\x88\xd0\xbb\xd0\xb0' <===> Чӑвашла
b'\xc4\x8ce\xc5\xa1tina' <===> Čeština
b'Cymraeg' <===> Cymraeg
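
The run above passes utf-8 and strict, and UTF-8 can represent every character in the file, so nothing fails. As a rough sketch of what the two arguments control, the snippet below encodes one name from the list with different encodings and error handlers. The helper name try_encode is hypothetical and only exists for this illustration.

```python
def try_encode(text, encoding, errors):
    """Return the encoded bytes, or the exception class name if encoding fails."""
    try:
        return text.encode(encoding, errors=errors)
    except UnicodeEncodeError as exc:
        return type(exc).__name__

word = "Aragon\u00e9s"  # "Aragonés", written with an escape for the accented e

print(try_encode(word, "utf-8", "strict"))   # b'Aragon\xc3\xa9s'
print(try_encode(word, "ascii", "strict"))   # UnicodeEncodeError
print(try_encode(word, "ascii", "replace"))  # b'Aragon?s'
print(try_encode(word, "ascii", "ignore"))   # b'Aragons'

# One character can take more than one byte in UTF-8.
print(len(word), len(word.encode("utf-8")))  # 8 9
```

The accented letter is the only character ASCII cannot represent: strict raises an error, replace substitutes a question mark, and ignore drops the character.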

Done.


First published / last updated: 2018.09.03 / 2018.09.03
