NumPy Integration

PyArrow supports converting between NumPy arrays and Arrow :ref:`Arrays <data.array>` in both directions.

NumPy to Arrow

To convert a NumPy array to Arrow, call the :func:`pyarrow.array` factory function.

>>> import numpy as np
>>> import pyarrow as pa
>>> data = np.arange(10, dtype='int16')
>>> arr = pa.array(data)
>>> arr
<pyarrow.lib.Int16Array object at 0x7fb1d1e6ae58>
[
  0,
  1,
  2,
  3,
  4,
  5,
  6,
  7,
  8,
  9
]

Conversion from NumPy supports a wide range of input dtypes, including structured dtypes and strings.

Arrow to NumPy

In the reverse direction, the :meth:`~pyarrow.Array.to_numpy` method produces a view of an Arrow Array for use with NumPy. A zero-copy view is limited to primitive types for which NumPy has the same physical representation as Arrow, and it requires that the Arrow data contain no nulls.

>>> import numpy as np
>>> import pyarrow as pa
>>> arr = pa.array([4, 5, 6], type=pa.int32())
>>> view = arr.to_numpy()
>>> view
array([4, 5, 6], dtype=int32)

For more complex data types, use the :meth:`~pyarrow.Array.to_pandas` method, which constructs a NumPy array with pandas semantics (for example, for the representation of null values).