Plotting a histogram

Exercise

PDAL doesn’t provide every possible analysis option, but it aims to make it convenient to connect to other ecosystems that offer substantial functionality. One of those is the Python/NumPy ecosystem, which is reached through PDAL’s Python bindings and the filters.python filter. These tools let you manipulate point cloud data with Python rather than writing substantial C/C++ software to achieve simple tasks, compute simple statistics, or investigate data quality issues.

This exercise uses PDAL to create a histogram plot of every dimension in a file. matplotlib is a Python package for plotting graphs and figures, and we can use it in combination with the Python bindings for PDAL to create the plot. These histograms can be useful diagnostics in an analysis pipeline. We will write a Python script that draws the histograms and run it from a pipeline.

Note

Python allows you to enhance and build functionality that you can use in the context of other Pipeline operations.

PDAL Pipeline

We’re going to create a PDAL Pipeline to tell PDAL to run our Python script in a filters.python stage.

{
    "pipeline": [
        {
            "filename": "./exercises/python/athletic-fields.laz"
        },
        {
            "type": "filters.python",
            "function": "make_plot",
            "module": "anything",
            "pdalargs": "{\"filename\":\"./exercises/python/histogram.png\"}",
            "script": "./exercises/python/histogram.py"
        },
        {
            "type": "writers.null"
        }
    ]
}

Note

This pipeline is available in your workshop materials in the ./exercises/python/histogram.json file.

Python script

The following Python script does the actual work of creating the histogram plot with matplotlib. Store it as histogram.py next to the histogram.json Pipeline file above. The script is mostly regular Python except for the ins and outs arguments to the function: PDAL passes each of them as a dictionary of NumPy arrays keyed by dimension name.

Note

This Python file is available in your workshop materials in the ./exercises/python/histogram.py file.

# Select the non-interactive Agg renderer before importing
# pyplot so the script can draw without a display.
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

# BytesIO holds the PNG bytes in memory (Python 3).
from io import BytesIO

# The make_plot function will do all of our work. The
# filters.python filter expects a function name in the
# module that has at least two arguments: "ins", which
# are numpy arrays for each dimension, and "outs", which
# the script can alter/set/adjust to have them updated for
# further processing.
def make_plot(ins, outs):

    # one subplot row per dimension; row increments in the loop
    row = 1

    fig = plt.figure(1, figsize=(6, 8.5), dpi=300)

    for key in ins:
        dimension = ins[key]
        ax = fig.add_subplot(len(ins), 1, row)

        # histogram the current dimension with 30 bins
        n, bins, patches = ax.hist(dimension, 30,
                                   density=False,
                                   facecolor='grey',
                                   alpha=0.75,
                                   align='mid',
                                   histtype='stepfilled',
                                   linewidth=None)

        # Set plot particulars
        ax.set_ylabel(key, size=10, rotation='horizontal')
        ax.get_xaxis().set_visible(False)
        # an empty tick list hides both the y ticks and their labels
        ax.set_yticks([])
        ax.set_xlim(dimension.min(), dimension.max())
        ax.set_ylim(n.min(), n.max())

        # increment plot position
        row = row + 1

    # We will save the PNG bytes to a BytesIO instance
    # and then write that to a file.
    output = BytesIO()
    fig.savefig(output, format="PNG")

    # a module global variable called 'pdalargs' is available
    # to filters.python modules. It contains a dictionary of
    # arguments that the user can explicitly pass into the module.
    # We passed in a filename entry through the pipeline's
    # pdalargs option.
    if 'filename' in pdalargs:
        filename = pdalargs['filename']
    else:
        filename = 'histogram.png'

    # open up the filename and write out the
    # bytes of the PNG stored in the BytesIO instance
    with open(filename, 'wb') as o:
        o.write(output.getvalue())

    # filters.python scripts need to
    # return True to tell the filter it was successful.
    return True
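
To get a feel for the shape of the ins argument without running PDAL, the following sketch builds a synthetic stand-in and bins each dimension with NumPy. It is only an illustration: the dimension names, the random values, and the summarize helper are assumptions made for this example, not part of PDAL or of the workshop script. It uses np.histogram with 30 equal-width bins, which should mirror the counts that the hist call in the script plots.

```python
import numpy as np

# Synthetic stand-in for the "ins" argument: one NumPy array per
# dimension name. The names and values are made up for illustration.
rng = np.random.default_rng(seed=0)
ins = {
    "X": rng.uniform(0.0, 100.0, size=1000),
    "Z": rng.normal(10.0, 2.0, size=1000),
    "Intensity": rng.integers(0, 256, size=1000).astype(np.uint16),
}


def summarize(ins, bin_count=30):
    """Return {dimension name: (counts, bin_edges)} using equal-width bins."""
    summary = {}
    for key, dimension in ins.items():
        # By default the bins span the minimum to the maximum of the array,
        # so every point falls into exactly one bin.
        counts, edges = np.histogram(dimension, bins=bin_count)
        summary[key] = (counts, edges)
    return summary


summary = summarize(ins)
for key, (counts, edges) in summary.items():
    print(key, "range:", edges[0], "to", edges[-1], "fullest bin:", counts.max())
```

A quick check like this can help confirm that a dimension has a sensible range before spending time on the plot itself.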

Run pdal pipeline

pdal pipeline ./exercises/python/histogram.json
[Screenshot: running the pdal pipeline command]
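
If you would rather start the same command from a Python script, for example as one step in a larger workflow, a minimal sketch using only the standard library could look like the following. The build_command and run_pipeline helpers are hypothetical names introduced for this example, and the sketch assumes the pdal executable is on your PATH; when it is not, the script reports that and does nothing.

```python
import shutil
import subprocess

PIPELINE = "./exercises/python/histogram.json"


def build_command(pipeline_path):
    """Return the argument list for the pdal pipeline command."""
    return ["pdal", "pipeline", pipeline_path]


def run_pipeline(pipeline_path):
    """Run the pipeline if pdal is installed; return its exit code, else None."""
    if shutil.which("pdal") is None:
        print("pdal executable not found on PATH; skipping")
        return None
    # check=False: report the exit code instead of raising on failure
    completed = subprocess.run(build_command(pipeline_path), check=False)
    return completed.returncode


if __name__ == "__main__":
    run_pipeline(PIPELINE)
```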

Output

[Figure: the resulting histogram plot, one panel per dimension]

Notes

  1. writers.null simply swallows the output of the pipeline. We don’t need to write any data.

  2. The pdalargs value is itself a JSON document stored as a string inside the pipeline JSON, so its inner double quotes must be escaped with backslashes.
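
The escaping described in note 2 does not have to be written by hand. As a minimal sketch using only Python’s standard json module, the pipeline from this exercise can be built as a dictionary, with the pdalargs dictionary serialized to a string first; serializing the outer document then adds the backslashes. The variable names are illustrative, and json.dumps puts a space after each colon, so the output differs from the hand-written file in whitespace only.

```python
import json

# Arguments for the script. They are serialized to a JSON string first
# because the pipeline stores pdalargs as a string value.
pdalargs = {"filename": "./exercises/python/histogram.png"}

pipeline = {
    "pipeline": [
        {"filename": "./exercises/python/athletic-fields.laz"},
        {
            "type": "filters.python",
            "function": "make_plot",
            "module": "anything",
            "pdalargs": json.dumps(pdalargs),
            "script": "./exercises/python/histogram.py",
        },
        {"type": "writers.null"},
    ]
}

# Serializing the outer document escapes the inner double quotes.
pipeline_json = json.dumps(pipeline, indent=4)
print(pipeline_json)
```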