C++ - <cfloat>

The C++ <cfloat> header file describes the characteristics of the floating-point types. It contains platform-dependent and implementation-specific floating-point limits. A floating-point value consists of four parts:

Sign: It can be either negative or non-negative.
Base: Also known as the radix of the exponent representation: 2 for binary, 10 for decimal, 16 for hexadecimal, and so on.
Significand: Also known as the mantissa. It is a series of digits in the given base, and the number of digits in this series is known as the precision.
Exponent: Also known as the characteristic or scale. It is the integer power to which the base is raised, and it scales the significand.

Based on the above four parts, a floating-point value can be expressed as follows:

value of floating-point = ± significand x base^exponent

C++ <cfloat> Macro Constants

The macros listed below are implementation-specific and are defined with the #define directive. Where a limit is stated, it is the minimum or maximum value that every implementation must satisfy. The following abbreviations appear in the macro names:

FLT: refers to float
DBL: refers to double
LDBL: refers to long double
DIG: refers to digits
MAX: refers to maximum
MIN: refers to minimum
MANT: refers to mantissa

Macros

Description

FLT_RADIX

Radix (integer base) used by the representation of all three floating-point types (float, double and long double).
Minimum value is 2.

FLT_MANT_DIG
DBL_MANT_DIG
LDBL_MANT_DIG

Number of base-FLT_RADIX digits in the significand of float, double and long double respectively. Any integer with this many base-FLT_RADIX digits or fewer can be represented without losing precision.

FLT_DIG
DBL_DIG
LDBL_DIG

Number of decimal digits that can be rounded into a floating-point value and back again to the same decimal digits, without loss of precision.
These are at least 6, 10 and 10 digits for float, double and long double respectively.

FLT_MAX
DBL_MAX
LDBL_MAX

Maximum finite value of float, double and long double respectively.
Each is at least 10^37.

FLT_MIN
DBL_MIN
LDBL_MIN

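Minimum normalized positive value of float, double and long double respectively. These are not the most negative values; the most negative finite values are -FLT_MAX, -DBL_MAX and -LDBL_MAX.
Each is at most 10^-37.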

FLT_MAX_EXP
DBL_MAX_EXP
LDBL_MAX_EXP

Maximum positive integer such that FLT_RADIX raised to the power one less than that integer is a representable finite float, double and long double respectively.

FLT_MIN_EXP
DBL_MIN_EXP
LDBL_MIN_EXP

Minimum negative integer such that FLT_RADIX raised to the power one less than that integer is a normalized float, double and long double respectively.

FLT_MAX_10_EXP
DBL_MAX_10_EXP
LDBL_MAX_10_EXP

Maximum positive integer such that 10 raised to that power is a representable finite float, double and long double respectively.
Each is at least 37.

FLT_MIN_10_EXP
DBL_MIN_10_EXP
LDBL_MIN_10_EXP

Minimum negative integer such that 10 raised to that power is a normalized float, double and long double respectively.
Each is at most -37.

FLT_EPSILON
DBL_EPSILON
LDBL_EPSILON

Difference between 1.0 and the next larger representable value for float, double and long double respectively.
These are at most 10^-5, 10^-9 and 10^-9 for float, double and long double respectively.

FLT_ROUNDS

Rounding mode used for floating-point addition. Possible values are:

-1 : indeterminable
0 : rounding toward zero
1 : rounding to nearest
2 : rounding toward positive infinity
3 : rounding toward negative infinity

FLT_EVAL_METHOD (C++11)

Specifies the evaluation format of arithmetic operations. Possible values are:

Value	Description
-1	indeterminable
0	All operations and constants evaluate in the range and precision of the type used. Additionally, float_t and double_t are equivalent to float and double respectively.
1	All operations and constants evaluate in the range and precision of double. Additionally, both float_t and double_t are equivalent to double.
2	All operations and constants evaluate in the range and precision of long double. Additionally, both float_t and double_t are equivalent to long double.
Other negative values	Implementation-defined behavior.

DECIMAL_DIG (C++11)

Number of decimal digits needed so that any value of the widest supported floating-point type can be converted to decimal text and back again to the same floating-point value. It is at least 10.

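FLT_DECIMAL_DIG
DBL_DECIMAL_DIG (C++17)
LDBL_DECIMAL_DIG

Number of decimal digits needed so that any float, double and long double value respectively can be converted to decimal text and back again to the same floating-point value.
These are at least 6, 10 and 10 respectively.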

FLT_TRUE_MIN
DBL_TRUE_MIN (C++17)
LDBL_TRUE_MIN

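Minimum positive value of float, double and long double respectively, including subnormal values. When subnormal numbers are supported, these are smaller than FLT_MIN, DBL_MIN and LDBL_MIN.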

FLT_HAS_SUBNORM
DBL_HAS_SUBNORM (C++17)
LDBL_HAS_SUBNORM

Specifies whether the type supports subnormal (denormal) numbers:

-1 : indeterminable
0 : absent
1 : present

Example:

The example below prints several of the macro constants defined in the C++ <cfloat> header.

#include <iostream>
#include <cfloat>
using namespace std;

int main (){

  cout<<"FLT_RADIX: "<<FLT_RADIX<<"\n";
  cout<<"FLT_DIG: "<<FLT_DIG<<"\n";
  cout<<"FLT_MAX: "<<FLT_MAX<<"\n";
  cout<<"FLT_MIN: "<<FLT_MIN<<"\n";
  cout<<"FLT_MAX_EXP: "<<FLT_MAX_EXP<<"\n";
  cout<<"FLT_MIN_EXP: "<<FLT_MIN_EXP<<"\n";
  cout<<"FLT_MAX_10_EXP: "<<FLT_MAX_10_EXP<<"\n";
  cout<<"FLT_MIN_10_EXP: "<<FLT_MIN_10_EXP<<"\n";
  cout<<"FLT_EPSILON: "<<FLT_EPSILON<<"\n";
  cout<<"FLT_ROUNDS: "<<FLT_ROUNDS<<"\n";
  cout<<"FLT_EVAL_METHOD: "<<FLT_EVAL_METHOD<<"\n";

  return 0;
}

The output of the above code is machine dependent. One possible output is:

FLT_RADIX: 2
FLT_DIG: 6
FLT_MAX: 3.40282e+38
FLT_MIN: 1.17549e-38
FLT_MAX_EXP: 128
FLT_MIN_EXP: -125
FLT_MAX_10_EXP: 38
FLT_MIN_10_EXP: -37
FLT_EPSILON: 1.19209e-07
FLT_ROUNDS: 1
FLT_EVAL_METHOD: 0