readers.numpy#
PDAL has support for processing data using filters.python, but it is also convenient to read data from Numpy for processing in PDAL.
Numpy supports saving files with the save
method, usually with the
extension .npy
. As of PDAL 1.7.0, .npz
files were not yet supported.
Warning
It is untested whether problems may occur if the versions of Python used in writing the file and for reading the file don’t match.
Array Types#
readers.numpy supports reading data in two forms:
As a structured array with specified field names (from laspy for example)
As a standard array that contains data of a single type.
Structured Arrays#
Numpy arrays can be created as structured data, where each entry is a set
of fields. Each field has a name. As an example, laspy provides its
.points
as an array of named fields:
import laspy
f = laspy.file.File('test/data/autzen/autzen.las')
print (f.points[0:1])
array([ ((63608330, 84939865, 40735, 65, 73, 1, -11, 126, 7326, 245385.60820904),)],
dtype=[('point', [('X', '<i4'), ('Y', '<i4'), ('Z', '<i4'), ('intensity', '<u2'), ('flag_byte', 'u1'), ('raw_classification', 'u1'), ('scan_angle_rank', 'i1'), ('user_data', 'u1'), ('pt_src_id', '<u2'), ('gps_time', '<f8')])])
The numpy reader supports reading these Numpy arrays and mapping
field names to standard PDAL dimension names.
If that fails, the reader retries by removing _
, -
, or space
in turn. If that also fails, the array field names are used to create
custom PDAL dimensions.
Standard (non-structured) Arrays#
Arrays without field information contain a single datatype. This datatype is
mapped to a dimension specified by the dimension
option.
f = open('./perlin.npy', 'rb')
data = np.load(f,)
data.shape
(100, 100)
data.dtype
dtype('float64')
pdal info perlin.npy --readers.numpy.dimension=Intensity --readers.numpy.assign_z=4
{
"filename": "..\/test\/data\/plang\/perlin.npy",
"pdal_version": "1.7.1 (git-version: 399e19)",
"stats":
{
"statistic":
[
{
"average": 49.5,
"count": 10000,
"maximum": 99,
"minimum": 0,
"name": "X",
"position": 0,
"stddev": 28.86967866,
"variance": 833.4583458
},
{
"average": 49.5,
"count": 10000,
"maximum": 99,
"minimum": 0,
"name": "Y",
"position": 1,
"stddev": 28.87633116,
"variance": 833.8425015
},
{
"average": 0.01112664759,
"count": 10000,
"maximum": 0.5189296418,
"minimum": -0.5189296418,
"name": "Intensity",
"position": 2,
"stddev": 0.2024120437,
"variance": 0.04097063545
}
]
}
}
X, Y and Z Mapping#
Unless the X, Y or Z dimension is specified as a field in a structured array, the reader will create dimensions X, Y and Z as necessary and populate them based on the position of each item of the array. Although Numpy arrays always contain contiguous, linear data, that data can be seen to be arranged in more than one dimension. A two-dimensional array will cause dimensions X and Y to be populated. A three dimensional array will cause X, Y and Z to be populated. An array of more than three dimensions will reuse the X, Y and Z indices for each dimension over three.
When reading data, X Y and Z can be assigned using row-major (C) order or
column-major (Fortran) order by using the order
option.
Dynamic Plugin
This stage requires a dynamic plugin to operate
Streamable Stage
This stage supports streaming operations
Loading Options#
readers.numpy supports two modes of operation - the first is to pass a
reference to a .npy
file to the filename
argument. It will simply load
it and read.
The second is to provide a reference to a .py
script to the filename
argument.
It will then invoke the Python function specified in module
and function
with
the fargs
that you provide.
Loading from a Python script#
A reference to a Python function that returns a Numpy array can also be used to tell readers.numpy what to load. The following example itself loads a Numpy array from a Python script
Python Script#
import numpy as np
def load(filename):
array = np.load(filename)
return array
Command Line Invocation#
Using the above Python file with its load
function, the following
pdal info invocation passes in the reference to the filename to load.
pdal info threedim.py \
--readers.numpy.function=load \
--readers.numpy.fargs=threedim.npy \
--driver readers.numpy
Pipeline#
An example Pipeline definition would follow:
[
{
"function": "load",
"filename": "threedim.py",
"fargs": "threedim.npy",
"type": "readers.numpy"
},
...
]
Options#
- filename
npy file to read or optionally, a .py file that defines a function that returns a Numpy array using the
module
,function
, andfargs
options. [Required]
- count
Maximum number of points to read. [Default: unlimited]
- override_srs
Spatial reference to apply to the data. Overrides any SRS in the input itself. Can be specified as a WKT, PROJ or EPSG string. Can’t use with ‘default_srs’. [Default: none]
- default_srs
Spatial reference to apply to the data if the input does not specify one. Can be specified as a WKT, PROJ or EPSG string. Can’t use with ‘override_srs’. [Default: none]
- dimension
Dimension name to map raster values
- order
Either ‘row’ or ‘column’ to specify assigning the X,Y and Z values in a row-major or column-major order. [Default: matches the natural order of the array.]
- module
The Python module name that is holding the function to run.
- function
The function name in the module to call.
- fargs
The function args to pass to the function
Note
The functionality of the ‘assign_z’ option in previous versions is provided with filters.assign
The functionality of the ‘x’, ‘y’, and ‘z’ options in previous versions are generally handled with the current ‘order’ option.