Python ascii() built-in function

The Python ascii() and repr() functions perform similar tasks. In this post, we will discuss the ascii() function in detail and also see the differences between the ascii() and repr() functions in Python.

i)Python ascii()

The Python ascii() function returns a printable representation of an object, as repr() does, that consists of only ASCII characters. Any non-ASCII character in that representation is replaced with an escape sequence: \x, \u or \U followed by the character's hexadecimal Unicode code point.

The \x escape is followed by two hexadecimal digits and is used for code points up to U+00FF. The \u escape is followed by four hexadecimal digits (code points up to U+FFFF) and the \U escape by eight (code points above U+FFFF).
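
As a rough illustration of the three escape forms, the sketch below passes one character from each code point range to ascii(). The helper name escape_of and the sample characters are chosen here for illustration only and are not part of Python itself.

```python
def escape_of(ch):
    """Return what ascii() produces for one character, without the quotes."""
    return ascii(ch)[1:-1]

# One sample character per code point range (illustrative choices).
samples = ["é", "ə", "😀"]

for ch in samples:
    # ord() gives the code point; ascii() picks \x, \u or \U by its size.
    print(f"U+{ord(ch):04X} -> {escape_of(ch)}")
```

The first character should come back with a \x escape, the second with \u and the third with \U.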

>>> st="New string"
>>> ascii( st )
"'New string'"
>>> uni='Deutsche'
>>> ascii( uni )
"'Deutsche'"

In the example above there are no non-ASCII characters, so ascii() returns both strings unchanged apart from the added quotes. Let’s look at the example below, where non-ASCII Unicode characters are used in a string.

>>> uni='jərmən'
>>> ascii( uni )
"'j\\u0259rm\\u0259n'"

In the string returned by ascii(), each ‘ə’ has been replaced with the escape ‘\u0259’. The interpreter shows it as ‘\\u0259’ because it displays the returned string using repr(), which doubles the backslash; the string itself contains a single backslash. Note that 0259 is the hexadecimal Unicode code point (U+0259) of the character ‘ə’. In the table at https://en.wikipedia.org/wiki/List_of_Unicode_characters, search for the decimal number 601: it corresponds to the code point U+0259 and the character ‘ə’.
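
The link between the decimal number 601 and the escape \u0259 can also be checked from Python itself. This is a small sketch using only the built-in ord(), hex() and len() functions; the variable names are arbitrary.

```python
ch = "ə"

code_point = ord(ch)        # decimal code point of the character
print(code_point)           # 601
print(hex(code_point))      # 0x259

escaped = ascii("jərmən")
print(escaped)              # 'j\u0259rm\u0259n' (single backslashes)
print(len(escaped))         # 18: each ə became the six characters \u0259
```

Using print() instead of letting the interpreter echo the value shows the string as it really is, with one backslash per escape.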

Let’s look at another example using Japanese and Russian script characters.

>>> #Testing Japanese characters
>>> jp='見える'
>>> ascii( jp )
"'\\u898b\\u3048\\u308b'"
>>> #Testing Russian characters
>>> rs='смотреть'
>>> ascii( rs )
"'\\u0441\\u043c\\u043e\\u0442\\u0440\\u0435\\u0442\\u044c'"

Try checking each Unicode code point in the output against a Unicode character table; you will see that each one matches the corresponding Japanese or Russian character.
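
One way to do this check without a table is to ask Python for each character's code point and official name. The sketch below assumes the standard unicodedata module; the helper code_points is a name made up for this example.

```python
import unicodedata

def code_points(text):
    """Return the code point of every character as a 'U+XXXX' string."""
    return [f"U+{ord(ch):04X}" for ch in text]

for word in ["見える", "смотреть"]:
    print(ascii(word))
    for ch, cp in zip(word, code_points(word)):
        # unicodedata.name() looks up the character's Unicode name.
        print("  ", cp, unicodedata.name(ch))
```

The hexadecimal digits printed for each character should match, ignoring letter case, the digits after \u in the ascii() output.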

Note that the same concept also applies to the list, tuple, set and dictionary data types. Consider the example below.

>>> #Testing in list
>>> ls=['見える' , 'New' , 'jərmən']
>>> ascii( ls )
"['\\u898b\\u3048\\u308b', 'New', 'j\\u0259rm\\u0259n']"
>>> #Testing in dictionary
>>> dic={ 23:'вид', 200:'внешность'}
>>> ascii( dic )
"{23: '\\u0432\\u0438\\u0434', 200: '\\u0432\\u043d\\u0435\\u0448\\u043d\\u043e\\u0441\\u0442\\u044c'}"

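The tuple and set types mentioned above behave the same way, and so do nested containers. The following is a minimal sketch with made-up sample values; the set holds a single element so that its printed order is predictable.

```python
# Tuple containing a non-ASCII string and an integer.
tp = ("вид", 1)
print(ascii(tp))       # ('\u0432\u0438\u0434', 1)

# Set with one element.
st = {"ə"}
print(ascii(st))       # {'\u0259'}

# Dictionary whose value is a list: escaping reaches every level.
nested = {"k": ["見"]}
print(ascii(nested))   # {'k': ['\u898b']}
```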

Difference between ascii() and repr()

The Python repr() function returns the representation of a string with its printable characters, ASCII or not, left as they are. The relation between repr() and ascii() is that any non-ASCII character that repr() leaves as it is will be replaced with a \x, \u or \U escape when the same object is passed to the ascii() function.

>>> st="New"
>>> repr( st )
"'New'"
>>> repr( st +st ) #Work fine
"'NewNew'"
>>> repr( '看'+'看' )
"'看看'"
>>> ascii( '看'+'看' )
"'\\u770b\\u770b'"

repr() returned the characters as they are, but ascii() replaced them with their Unicode escapes.
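
The relation can be made concrete by rebuilding the ascii() result from repr(). The sketch below assumes that encoding the repr() output to ASCII with the standard "backslashreplace" error handler reproduces the same escapes, which matches the behaviour described above; ascii_via_repr is a hypothetical helper name, not a built-in.

```python
import ast

def ascii_via_repr(obj):
    """Escape the non-ASCII characters of repr(obj) the way ascii() does."""
    return repr(obj).encode("ascii", "backslashreplace").decode("ascii")

for value in ["New", "看看", "jərmən", ["вид", 23]]:
    # Both routes should give the same ASCII-only text.
    print(ascii(value), ascii(value) == ascii_via_repr(value))
    # The escapes are valid Python literals, so the value can be recovered.
    assert ast.literal_eval(ascii(value)) == value
```

Because the escapes are ordinary string literal syntax, the ascii() output can be pasted into ASCII-only source code or logs and still be read back as the original value.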




