
NumPy - Data Types



NumPy supports a much greater variety of numerical types than Python does. Below is a list of the most commonly used scalar data types defined in NumPy.

Data Type    Description
bool_ Boolean (True or False) stored as a byte.
short 16-bit signed integer; same as C short.
intc 32-bit signed integer; same as C int.
int_ Default integer type; a 64-bit signed integer on most 64-bit platforms (historically the same as C long). Unlike Python int, it has a fixed size.
int8 Byte (-128 to 127).
int16 Integer (-32768 to 32767).
int32 Integer (-2147483648 to 2147483647).
int64 Integer (-9223372036854775808 to 9223372036854775807).
uint8 Unsigned integer (0 to 255).
uint16 Unsigned integer (0 to 65535).
uint32 Unsigned integer (0 to 4294967295).
uint64 Unsigned integer (0 to 18446744073709551615).
intp Integer used for indexing, typically the same as ssize_t.
float_ Same as float64 (this alias was removed in NumPy 2.0; use float64 instead).
float16 16-bit (half-precision) floating-point number type: sign bit, 5 bits exponent, 10 bits mantissa.
float32 32-bit (single-precision) floating-point number type: sign bit, 8 bits exponent, 23 bits mantissa.
float64 64-bit (double-precision) floating-point number type: sign bit, 11 bits exponent, 52 bits mantissa.
double 64-bit precision floating-point number type: sign bit, 11 bits exponent, 52 bits mantissa. Same as Python float and C double.
complex_ Same as complex128 (this alias was removed in NumPy 2.0; use complex128 instead).
complex64 Complex number, represented by two 32-bit floats (real and imaginary components).
complex128 Complex number, represented by two 64-bit floats (real and imaginary components).
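
The ranges and bit layouts listed above can be checked from NumPy itself. The following is a minimal sketch using np.iinfo and np.finfo; the variable names are only illustrative.

```python
import numpy as np

# Integer limits reported by NumPy for two fixed-width types
int8_info = np.iinfo(np.int8)
uint16_info = np.iinfo(np.uint16)
print(int8_info.min, int8_info.max)      # -128 127
print(uint16_info.min, uint16_info.max)  # 0 65535

# Floating-point layout: number of mantissa and exponent bits
f16_info = np.finfo(np.float16)
f32_info = np.finfo(np.float32)
f64_info = np.finfo(np.float64)
print(f16_info.nmant, f16_info.nexp)     # 10 5
print(f32_info.nmant, f32_info.nexp)     # 23 8
print(f64_info.nmant, f64_info.nexp)     # 52 11
```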

NumPy numerical types are instances of dtype (data-type) objects, each having unique characteristics. Once NumPy is imported using:

import numpy as np

the dtypes are available as np.bool_, np.float32, etc.
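
As a short illustration of how these names are typically used, the sketch below creates a scalar and an array with explicit types and then converts between types. The names x, counts and flags are made up for this example.

```python
import numpy as np

# A NumPy scalar of a specific type
x = np.float32(1.5)
print(type(x).__name__, x.itemsize)    # float32 4

# An array whose elements are stored as 16-bit integers
counts = np.array([1, 2, 3], dtype=np.int16)
print(counts.dtype, counts.itemsize)   # int16 2

# Converting to another type returns a new array
flags = counts.astype(np.bool_)
print(flags.tolist())                  # [True, True, True]
```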

Data Type Objects (dtype)

A data type object describes how the bytes in the fixed-size block of memory corresponding to an array item should be interpreted. It describes the following aspects of the data:

  • Type of the data (integer, float, Python object, etc.)
  • Size of the data
  • Byte order of the data (little-endian or big-endian)
  • If the data type is a structured data type, the names of the fields, the data type of each field, and the part of the memory block taken by each field
  • If the data type is a subarray, its shape and data type
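
Each of the aspects listed above can be read back from a dtype object through its attributes. The sketch below is only an illustration; the names simple, point and block are hypothetical.

```python
import numpy as np

# Simple type: kind character and size in bytes
simple = np.dtype('i4')
print(simple.kind, simple.itemsize)   # i 4

# Structured type: field names and the byte offset of each field
point = np.dtype([('x', 'i4'), ('y', 'f8')])
print(point.names)                    # ('x', 'y')
y_type, y_offset = point.fields['y']
print(y_type, y_offset)               # float64 4
print(point.itemsize)                 # 12

# Subarray type: base type plus shape
block = np.dtype(('i4', (2, 3)))
print(block.base, block.shape)        # int32 (2, 3)
print(block.itemsize)                 # 24
```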

The byte order is specified by prefixing '<' or '>' to the data type. '<' means that the encoding is little-endian (the least significant byte is stored at the smallest address). '>' means that the encoding is big-endian (the most significant byte is stored at the smallest address).
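
To make the difference concrete, the sketch below stores the value 1 as a 4-byte integer in both byte orders and prints the raw bytes. The names big and little are illustrative.

```python
import numpy as np

# The same value stored with two different byte orders
big = np.array([1], dtype='>i4')
little = np.array([1], dtype='<i4')

# Big-endian: most significant byte first
print(big.tobytes())        # b'\x00\x00\x00\x01'
# Little-endian: least significant byte first
print(little.tobytes())     # b'\x01\x00\x00\x00'

# The numeric value is the same in both cases
print(big[0] == little[0])  # True
```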

A dtype object is constructed using the following syntax:

numpy.dtype(object, align, copy)

Parameters

object Required. The object to be converted to a data type object.
align Optional. If True, adds padding to the fields so that the layout matches what a C compiler would produce for a similar C struct.
copy Optional. If True, makes a new copy of the data type object. If False, the result may just be a reference to a built-in data type object.
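
The effect of align is easiest to see on a structured type. The following sketch compares a packed layout with an aligned one; the field names flag and value are hypothetical, and the aligned sizes depend on the platform's C alignment rules.

```python
import numpy as np

fields = [('flag', 'i1'), ('value', 'i4')]

# Default: fields are packed with no padding (1 + 4 bytes)
packed = np.dtype(fields)
print(packed.itemsize)               # 5

# align=True: padding is inserted as a C compiler would
aligned = np.dtype(fields, align=True)
print(aligned.itemsize)              # typically 8
print(aligned.fields['value'][1])    # offset of 'value', typically 4
print(aligned.isalignedstruct)       # True
```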

Example:

The example below shows how to create data type objects from array-scalar types and from their equivalent string codes.

import numpy as np

# using array-scalar type 
dt1 = np.dtype(np.int32) 
print("dt1:", dt1)
dt2 = np.dtype(np.float32) 
print("dt2:", dt2)

# int8, int16, int32, int64 can be replaced
# by the equivalent strings 'i1', 'i2', 'i4', 'i8'
dt3 = np.dtype('i4') 
print("dt3:", dt3)

# similarly, float16, float32, float64 can be
# replaced with the strings 'f2', 'f4', 'f8'
dt4 = np.dtype('f4') 
print("dt4:", dt4)

The output of the above code will be:

dt1: int32
dt2: float32
dt3: int32
dt4: float32

Example:

The example below shows how to create data type objects using endian notation.

import numpy as np

dt1 = np.dtype('>i4') 
print("dt1:", dt1)
dt2 = np.dtype('<i4') 
print("dt2:", dt2)

The output of the above code (on a little-endian machine, where '<' is the native byte order) will be:

dt1: >i4
dt2: int32

Example:

In the example below, a structured data type vertex is created with integer fields 'x' and 'y'. It is then applied to an ndarray object Arr.

import numpy as np

vertex = np.dtype([('x','>i4'), ('y', '>i4')]) 
Arr = np.array([(10, 20),
                (10, -20),
                (-10, 20),
                (-10, -20)], dtype = vertex)

#printing data type
print(vertex)

#printing (x, y) co-ordinates of Arr
print("\nArr contains:")
print(Arr)

The output of the above code will be:

[('x', '>i4'), ('y', '>i4')]

Arr contains:
[( 10,  20) ( 10, -20) (-10,  20) (-10, -20)]
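
Once an array uses a structured data type, its fields can be accessed by name. The sketch below rebuilds the vertex array from the example above and reads individual fields; the variable names are illustrative.

```python
import numpy as np

vertex = np.dtype([('x', '>i4'), ('y', '>i4')])
arr = np.array([(10, 20), (10, -20), (-10, 20), (-10, -20)], dtype=vertex)

# Indexing with a field name returns all values of that field
xs = arr['x']
print(xs.tolist())        # [10, 10, -10, -10]

# Indexing with an integer returns one record
second = arr[1]
print(second['y'])        # -20

# Fields can also be assigned to in place
arr['y'] = abs(arr['y'])
print(arr['y'].tolist())  # [20, 20, 20, 20]
```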

Example:

In this example, a structured data type student is created with a string field 'name', an integer field 'age', and a float field 'marks'. It is then applied to an ndarray object Arr.

import numpy as np

student = np.dtype([('name','S30'), ('age', 'i4'), ('marks', 'f4')]) 
Arr = np.array([('John', 25, 63.5),
                ('Marry', 24, 75),
                ('Ramesh', 24, 81),
                ('Kim', 23, 67.5)], dtype = student)

#printing Arr
print("Arr contains:")
print(Arr)

The output of the above code will be:

Arr contains:
[(b'John', 25, 63.5) (b'Marry', 24, 75. ) (b'Ramesh', 24, 81. )
 (b'Kim', 23, 67.5)]
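
The b prefix in the output appears because 'S30' stores raw bytes rather than text. The sketch below, which reuses the student type from the example above, shows one way to decode the names and to compute with the numeric fields; the variable names are illustrative.

```python
import numpy as np

student = np.dtype([('name', 'S30'), ('age', 'i4'), ('marks', 'f4')])
arr = np.array([('John', 25, 63.5),
                ('Marry', 24, 75),
                ('Ramesh', 24, 81),
                ('Kim', 23, 67.5)], dtype=student)

# 'S' fields hold bytes; decode them to get Python strings
names = [n.decode('ascii') for n in arr['name']]
print(names)        # ['John', 'Marry', 'Ramesh', 'Kim']

# Numeric fields behave like ordinary arrays
mean_marks = float(arr['marks'].mean())
print(mean_marks)   # 71.75

# A boolean mask on one field selects whole records
youngest = arr[arr['age'] == arr['age'].min()]
print(youngest['name'][0])   # b'Kim'
```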

Each built-in data type has a character code that uniquely identifies it. The first character specifies the kind of data and the remaining characters specify the number of bytes per item, except for Unicode, where it is interpreted as the number of characters. The item size must correspond to an existing type, or an error will be raised. The supported kinds are:

  • '?' − boolean
  • 'b' − (signed) byte
  • 'i' − (signed) integer
  • 'u' − unsigned integer
  • 'f' − floating-point
  • 'c' − complex floating-point
  • 'm' − timedelta
  • 'M' − datetime
  • 'O' − (Python) objects
  • 'S', 'a' − zero-terminated bytes (the 'a' alias is deprecated as of NumPy 2.0; prefer 'S')
  • 'U' − Unicode string
  • 'V' − raw data (void)
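
The sketch below illustrates how the character codes combine with a size, including the Unicode exception and the error raised for a size that matches no existing type. The variable names are illustrative.

```python
import numpy as np

# Kind character followed by the number of bytes per item
i2 = np.dtype('i2')
print(i2, i2.itemsize)    # int16 2

c16 = np.dtype('c16')
print(c16)                # complex128

# For 'S' the number is a byte count ...
s5 = np.dtype('S5')
print(s5.itemsize)        # 5

# ... but for 'U' it is a character count (4 bytes per character)
u5 = np.dtype('U5')
print(u5.itemsize)        # 20

# A size that matches no existing type is rejected
try:
    np.dtype('i3')
    rejected = False
except TypeError as exc:
    rejected = True
    print("rejected:", exc)
```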