Floating point data type : Built-in data type.


floating point data type

The floating point data type is a built-in data type in C++, like the integer data type. Like an integer type it represents a number, but it can also hold a fractional part, so it can approximate a real number. The program below uses float, one of the floating point types, to show that a floating point variable can hold a value with a decimal point. There are two more floating point types, which are discussed later.

 

#include <iostream>

using namespace std ;

int main( )
{
float f=12.3458976 ;  ///assigning a real number to a float variable
int i=90.567 ;  ///the fractional part .567 is discarded

cout<< f << endl
<< i << endl ;

cin.get() ;
return 0;
}

 
The output is,
12.3459
90

For the int variable the fractional part is discarded. The float variable keeps it, but cout prints only six significant digits by default, so 12.3458976 is displayed as 12.3459 (a float itself holds roughly seven significant decimal digits). Because the int type cannot store a fractional part, assigning a floating point value to an int variable drops everything after the decimal point, as the example above shows. Another example is given below.

*Link: see the related link on the IEEE 754 internal format at the end of this post to know why the integer type does not support a decimal point value but the floating point type does.

float f1=9078.87964 , f2 ;
int i1=4577 , i2 ;

i2=f1+i1 ;
f2=i1+f1 ;

cout << i2 << " " << f2 << endl ;

 
The output is,

13655 13655.9

The fractional part is preserved in the float variable but lost in the int variable. So a floating point type is the better choice whenever a calculation can produce a fractional result.





Types of floating point data type.

Floating point data type can be divided into three types:

i) float
ii) double and
iii) long double.

The differences between the three types are shown later in this post. Unlike the int type, a floating point type cannot be further divided into signed and unsigned types, so a type such as signed float or unsigned double is invalid.

When a very large or very small value is held in a floating point variable, cout prints it in power-of-10 (scientific) notation by default. The program below demonstrates this.

#include <iostream>

using namespace std;

int main( )
{
float ff=.987 , ff1=5678987654 ;
double df=-9345 , df1=6778889;
long double ld=898 , ld1=-8999766 ;

cout << ff << " , " << ff1 << endl
  << df << " , " << df1 << endl
  << ld << " , " << ld1 << endl ;

cin.get( ) ;
return 0 ;
}

 
The output is,

0.987 , 5.67899e+009
-9345 , 6.77889e+006
898 , -8.99977e+006

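In the above output, e stands for "times 10 raised to the power", so 5.67899e+009 means 5.67899 * 10^9 (some compilers print the exponent with two digits, as in 5.67899e+09). Because a floating point value is stored as a significand together with an exponent, the type can represent a number as small as 3.3621e-4932 and a number as large as 1.18973e+4932 (these are the long double limits on a platform with an extended-precision long double). The ranges of values the different floating point types can represent are shown later.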


 



Difference between float,double and long double

The difference between float, double and long double lies mainly in the storage size each type occupies in memory and in the precision, that is, the number of significant digits, each can represent. On typical platforms double offers roughly double the precision of float (about 15 significant decimal digits against about 6 to 7), and long double offers at least as much precision as double. float usually has a size of 4 bytes and double a size of 8 bytes; the size of long double depends on the compiler and platform (commonly 8, 12 or 16 bytes). The program below prints their sizes and the minimum and maximum values they can represent. Include the header <cfloat> to get the minimum and maximum value of each type.

*Link: see the related link on precision and accuracy at the end of this post to know more about precision and accuracy in floating point data types.

#include <iostream>
#include <cfloat>

using namespace std;

int main( )
{
cout<< "Float size " << sizeof(float) << endl ;
cout<< "Double size " << sizeof(double) << endl ;
cout<< "Long double size " << sizeof(long double) << endl ;

/**Minimum (smallest positive normalized) value of float,double and long double **/
cout<< "\n **Minimum value \n" ;
cout<< "Float min value " << FLT_MIN << endl ;
cout<< "Double min value " << DBL_MIN << endl ;
cout<< "Long double min value " << LDBL_MIN << endl ;

/***Maximum value of float ,double and long double **/
cout<< "\n **Maximum value \n" ;
cout<< "Float max value " << FLT_MAX << endl ;
cout<< "Double max value " << DBL_MAX << endl ;
cout<< "Long double max value " << LDBL_MAX << endl ;

cin.get( ) ;
return 0 ;
}

If you run the program you will see that
float can hold values ranging from 1.17549e-038 to 3.40282e+038,
double values range from 2.22507e-308 to 1.79769e+308 and
long double values range from 3.3621e-4932 to 1.18973e+4932 (on a platform where long double is larger than double). Note that FLT_MIN, DBL_MIN and LDBL_MIN are the smallest positive normalized values, not the most negative ones, so the ranges obtained from the program cover positive values only. This does not mean that a floating point type cannot represent a negative value; it can. The magnitude of the negative range of each type is the same as that of its positive range. The negative range is shown below.

The -ve value range of float is -1.17549e-038 to -3.40282e+038
For double the range is -2.22507e-308 to -1.79769e+308 and
for long double the range is -3.3621e-4932 to -1.18973e+4932 .

Where long double is larger than double, as in the output above, it can represent the smallest and the largest magnitudes among all the floating point types.


 




Why should you prefer double type over the other types?

Among all the floating point data types, double is usually considered the best choice for representing a real number in a program. Here are some of the reasons.

First:: A floating point literal, i.e. a number written with a fractional part, is of double type by default.
 

cout<< 34.5657 << endl ; ///34.5657 is double type

cout<< typeid(34.5657).name() ; ///include the header <typeinfo> to use the typeid operator; the printed name is implementation-defined (e.g. "double" or "d")

Since such a literal is of double type, mixing it with a type other than double can give an unexpected result. Consider the program below.

float a=1 , b=6 , c , result ;
c=a/b ;

result=(c - (1/6) ) ;

cout << result ;


The output is,

0.166667

The output should be 0 but it isn't. Why? Note first that 1 and 6 in (1/6) are int literals, not floating point literals, so 1/6 is an integer division whose result is 0; result is therefore simply c, which is 0.166667. Writing the expression with floating point literals, (1.0/6.0), gives a double value, but result is still not exactly 0: c holds 1/6 rounded to float precision while 1.0/6.0 is rounded to double precision, and on a typical IEEE 754 implementation the two differ by about 4.97e-09 (discussed further in the related link on precision and accuracy). If a, b, c and result were all of double type and the literal were written as 1.0/6.0, the output would be 0; writing the literal as 1.0f/6.0f so that everything stays in float would also give 0. It is safer to use double whenever a fractional value is required, because expressions that use floating point literals directly (like 1.0/6.0) are bound to appear in a program.

Second:: The second reason is size. float is rather small, so it may not represent every value needed in a calculation with enough precision. What about long double? It can represent all the required values, but on many platforms its size is larger, so it may use more resources than necessary. double is the balanced choice: it holds the necessary values (large or small) and it is never larger than long double, so there is little risk of wasting resources.

Third:: The third reason is optimization. Some machines and compilers provide optimizations for calculations using the double type. It is worth taking advantage of these if you want to build a faster program.

We have seen what the floating point data type has in store for us, along with the different floating point types and their ranges of values. When a calculation needs the fractional part, a floating point type is preferable to an integer type, which would discard it and yield an inaccurate result. Lastly, we have seen why double is usually a better choice than the other two types, so prefer double unless you have a specific reason not to.


Related Link

->Float or floating point internal format(IEEE 754 format)(explains why float supports decimal point value while integer data type doesn’t).
 
->Precision and accuracy in float