C++: Why is the char type considered the smallest integer type?
Why is the char data type considered the smallest int/integer data type? There are two reasons:
i) The char type has the same internal format as the integer types. Basically, this means char bits and integer bits are stored in the same way: both follow the base-2 binary format.
ii) For the same reason, the char type is further subdivided into two types, signed char and unsigned char, just like the integer types.
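As a quick check of point i), the short sketch below (my addition, using std::bitset, which the original article does not use) prints the low 8 bits of a char and of an int holding the same value; the bit patterns come out identical.

#include <iostream>
#include <bitset>
using namespace std ;

int main( )
{
    char c = 'A' ;   // 'A' is stored as the integer 65
    int  i = 65 ;

    // Both lines print 01000001 : the same base-2 format
    cout << bitset<8>( static_cast<unsigned char>(c) ) << endl ;
    cout << bitset<8>( i ) << endl ;
    cin.get( ) ;
    return 0 ;
}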
If we literally consider the char type as an integer type, then it can represent only 256 (2^8) integers. This in turn means a char type can represent only 256 characters, because 8 bits can represent only 256 unique values: one character for each unique value.
Link: Character data type
In our program we can use as many unique integer values as we desire (provided none is larger than the maximum value representable by the long long type, which is 8 bytes or 64 bits), but the number of characters is limited to 256.
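To verify these limits on your own machine, a minimal sketch using the <climits> macros (an addition of mine, not part of the original article) could look like this:

#include <iostream>
#include <climits>   // CHAR_BIT , CHAR_MIN , CHAR_MAX , LLONG_MAX
using namespace std ;

int main( )
{
    cout << "bits in a char : " << CHAR_BIT << endl ;   // typically 8
    cout << "char range     : " << CHAR_MIN << " to " << CHAR_MAX << endl ;
    cout << "long long max  : " << LLONG_MAX << endl ;
    cin.get( ) ;
    return 0 ;
}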
Our understanding of the char type as the smallest integer type makes every character interchangeable with an integer, in the sense that a char can represent an integer value and vice versa. This allows every character to map to a unique integer value, but the reverse is not necessarily true, because there are far more representable integers than the 256 characters available to map to. So we can confidently state that many integers map to the same character. Not yet convinced? Look at the program below.
Link: Integer data type
#include <iostream>
using namespace std ;

int main( )
{
    char c1 = 36 , c2 = 56 , c3 = 78 , cc = 292 ;  /* integer values assigned to char ; 292 does not fit in 8 bits */
    char c4 = '*' , c5 = '/' , c6 = '1' ;
    int  i1 = c4 , i2 = c5 , i3 = c6 ;             // char assigned to int

    cout << c1 << " , " << c2 << " , " << c3 << " , " << cc << endl
         << i1 << " , " << i2 << " , " << i3 ;
    cin.get( ) ;
    return 0 ;
}
The output is,
$ , 8 , N , $
42 , 47 , 49
If we look at the output, c1 and cc were assigned different integer values yet they print the same character, whereas for the characters assigned to the int variables (i1, i2, i3) all the integer outputs are different.
To know which integers map to which character, we must be acquainted with the pattern the compiler uses to map integers to characters. The pattern is:
A character repeats itself after every 256 integer values (the codes run from 0 to 255, so there are 256 of them). For instance, 90 , 256+90 , (256*2)+90 , (256*3)+90 , ... will all have the same character value.
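Here is a minimal sketch of this wrap-around (my addition; on typical machines where char holds 8 bits, the stored byte is the original value modulo 256):

#include <iostream>
using namespace std ;

int main( )
{
    // 90 , 256+90 and (256*2)+90 all wrap around to the same stored byte
    char a = 90 ;
    char b = 256 + 90 ;        // 346 wraps to 90
    char c = (256 * 2) + 90 ;  // 602 wraps to 90
    cout << a << " , " << b << " , " << c << endl ;  // prints Z , Z , Z
    cin.get( ) ;
    return 0 ;
}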
But how do we know which 256 characters the integers map to? Worry not: each character and its corresponding integer value are given in a chart known as the ASCII chart, which is shown below.
Note: The ASCII chart proper consists of only the first 128 characters (0 to 127). Here, the remaining 128 characters are also shown in case you want to refer to them in your program.
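If the chart is not handy, you can also generate the printable portion of it yourself. The loop below (my addition) prints each code from 32 to 126 alongside its character:

#include <iostream>
using namespace std ;

int main( )
{
    // Print the printable ASCII characters with their integer codes
    for( int i = 32 ; i <= 126 ; i++ )
        cout << i << " -> " << char(i) << endl ;
    cin.get( ) ;
    return 0 ;
}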
Once we know the chart, conversion between the integer and char types becomes much simpler. For instance, using the chart, the program below prints out "Hello World!" using only integer values.
#include <iostream>
using namespace std ;

int main( )
{
    char double_quotation = 34 , H = 72 , e = 101 , l = 108 , o = 111 ,
         W = 87 , r = 114 , d = 100 , space = 32 , exclamation_mark = 33 ;

    cout << double_quotation << H << e << l << l << o << space
         << W << o << r << l << d << exclamation_mark
         << double_quotation << endl ;
    cin.get( ) ;
    return 0 ;
}
Output,
"Hello World!"
Some points to note about the char type
i) It is the smallest integer type.
ii) It can represent only 256 characters, consisting of the English alphabet, the digits 0-9, special characters like ? , . $ @ ( * etc., and a few others. This means it cannot represent the scripts of other languages (say Russian, Hindi, Japanese, etc.). This problem is solved by Unicode and the wchar_t type; see the sketch after this list.
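As a brief preview (a sketch of mine, not from the original article), wchar_t can hold character codes well beyond 255. The program prints the integer code of a Cyrillic letter rather than the glyph itself, since console rendering of wide characters varies by platform:

#include <iostream>
using namespace std ;

int main( )
{
    wchar_t ya = L'\u044F' ;   // Cyrillic small letter ya
    cout << (int)ya << endl ;  // prints 1103 : far beyond the 256 values of char
    cin.get( ) ;
    return 0 ;
}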
The char data type may be treated as the signed char type or the unsigned char type depending on your machine and compiler. If you want a signed character type, do not use plain char, as it may be interpreted as signed on some machines and unsigned on others. To avoid unexpected results when your program runs on different machines, use the signed char type explicitly if you need the signed type; that way there is no chance of the machine interpreting it differently. The size of the char data type (one byte, 256 distinct values) however remains the same on every machine, even though the range shifts between -128 to 127 (signed) and 0 to 255 (unsigned).
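To see the difference the explicit types make, here is a small sketch (my addition; the -56/200 equivalence assumes a two's-complement machine, which covers virtually all modern hardware):

#include <iostream>
using namespace std ;

int main( )
{
    signed char   s = -56 ;   // explicitly signed : range -128 to 127
    unsigned char u = 200 ;   // explicitly unsigned : range 0 to 255

    // Both hold the same bit pattern (11001000) but are interpreted differently
    cout << (int)s << " , " << (int)u << endl ;  // prints -56 , 200
    cin.get( ) ;
    return 0 ;
}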