# Plotting a histogram

## Exercise

PDAL doesn’t provide every possible analysis option, but it strives to make it convenient to connect PDAL to other tools that offer substantial functionality. One of those is the Python/NumPy ecosystem, which is accessed through PDAL’s Python bindings and the filters.python filter. These tools allow you to manipulate point cloud data with convenient Python tools rather than writing substantial C/C++ software to achieve simple tasks, compute simple statistics, or investigate data quality issues.

This exercise uses PDAL to create a histogram plot of all of the dimensions of a file. matplotlib is a Python package for plotting graphs and figures, and we can use it in combination with the Python bindings for PDAL to create a nice histogram. These histograms can be useful diagnostics in an analysis pipeline. We will combine a Python script to make a histogram plot with a pipeline.

Note

Python allows you to enhance and build functionality that you can use in the context of other Pipeline operations.

### PDAL Pipeline

We’re going to create a PDAL Pipeline to tell PDAL to run our Python script in a filters.python stage.

```
{
    "pipeline": [
        {
            "filename": "./exercises/python/athletic-fields.laz"
        },
        {
            "type": "filters.python",
            "function": "make_plot",
            "module": "anything",
            "pdalargs": "{\"filename\":\"./exercises/python/histogram.png\"}",
            "script": "./exercises/python/histogram.py"
        },
        {
            "type": "writers.null"
        }
    ]
}
```

Note

This pipeline is available in your workshop materials in the `./exercises/python/histogram.json` file.

### Python script

The following Python script will do the actual work of creating the histogram plot with matplotlib. Store it as `histogram.py` next to the `histogram.json` Pipeline file above. The script is mostly regular Python except for the `ins` and `outs` arguments to the function. Those are special arguments that PDAL expects to be dictionaries of NumPy arrays, one array per dimension, keyed by dimension name.

Note

This Python file is available in your workshop materials in the `./exercises/python/histogram.py` file.

```
# import numpy
import numpy as np

# import matplotlib stuff and make sure to use the
# AGG renderer.
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import matplotlib.mlab as mlab

# This only works for Python 3. Use
# StringIO for Python 2.
from io import BytesIO

# The make_plot function will do all of our work. The
# filters.python filter expects a function name in the
# module that has at least two arguments -- "ins" which
# are numpy arrays for each dimension, and the "outs" which
# the script can alter/set/adjust to have them updated for
# further processing.
def make_plot(ins, outs):

    # figure position and row will increment
    figure_position = 1
    row = 1

    fig = plt.figure(figure_position, figsize=(6, 8.5), dpi=300)

    for key in ins:
        dimension = ins[key]
        ax = fig.add_subplot(len(ins.keys()), 1, row)

        # histogram the current dimension with 30 bins
        n, bins, patches = ax.hist( dimension, 30,
                                    density=0,
                                    facecolor='grey',
                                    alpha=0.75,
                                    align='mid',
                                    histtype='stepfilled',
                                    linewidth=None)

        # Set plot particulars
        ax.set_ylabel(key, size=10, rotation='horizontal')
        ax.get_xaxis().set_visible(False)
        ax.set_yticklabels('')
        ax.set_yticks((),)
        ax.set_xlim(min(dimension), max(dimension))
        ax.set_ylim(min(n), max(n))

        # increment plot position
        row = row + 1
        figure_position = figure_position + 1

    # We will save the PNG bytes to a BytesIO instance
    # and then write that to a file.
    output = BytesIO()
    plt.savefig(output,format="PNG")

    # a module global variable, called 'pdalargs' is available
    # to filters.python modules. It contains a dictionary of
    # arguments that can be explicitly passed into the module
    # by the user. We passed in a filename arg in our pipeline JSON
    filename = pdalargs['filename'] if 'filename' in pdalargs else 'histogram.png'

    # open up the filename and write out the
    # bytes of the PNG stored in the BytesIO instance
    with open(filename, 'wb') as o:
        o.write(output.getvalue())


    # filters.python scripts need to
    # return True to tell the filter it was successful.
    return True
```
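To make the shape of `ins` more concrete, the sketch below builds a small stand-in dictionary by hand and computes the same 30-bin counts that the script plots. The `dimension_counts` helper and the sample values are hypothetical and only for illustration; inside PDAL, the dictionary is supplied by filters.python and the function must still return `True`.

```python
import numpy as np

def dimension_counts(ins, bins=30):
    """Return {dimension name: (min, max, per-bin counts)} for each array.

    Hypothetical helper, not part of PDAL: it mimics what make_plot
    computes before drawing, using numpy.histogram instead of matplotlib.
    """
    summary = {}
    for name, values in ins.items():
        counts, _edges = np.histogram(values, bins=bins)
        summary[name] = (values.min(), values.max(), counts)
    return summary

# Hand-built stand-in for the dictionary PDAL passes as "ins":
# one NumPy array per dimension, keyed by dimension name.
sample_ins = {
    "X": np.array([0.0, 1.0, 2.0, 3.0]),
    "Classification": np.array([2, 2, 5, 2], dtype=np.uint8),
}

summary = dimension_counts(sample_ins)
for name, (lo, hi, counts) in summary.items():
    print(name, lo, hi, int(counts.sum()))
```

Every point falls into exactly one bin, so the counts for each dimension sum to the number of points in the array.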

### Run `pdal pipeline`

```
$ pdal pipeline ./exercises/python/histogram.json
anything:47: UserWarning: Attempting to set identical left == right == 0 results in singular transformations; automatically expanding.

$
```
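The warning appears to come from the `ax.set_xlim(min(dimension), max(dimension))` call: when every value in a dimension is 0, the left and right limits are identical and matplotlib expands them on its own. It is harmless here. If you would rather avoid it, one option is to widen the limits yourself before passing them on. The `padded_limits` helper below is a hypothetical sketch, not part of the workshop script.

```python
def padded_limits(values, pad=0.5):
    """Return (low, high) axis limits, widened when all values are equal.

    Hypothetical helper: `pad` is an arbitrary half-width chosen for
    illustration.
    """
    low, high = min(values), max(values)
    if low == high:
        # A constant dimension would give identical limits, so
        # spread them symmetrically around the single value.
        return low - pad, high + pad
    return low, high

# A constant dimension gets widened limits...
print(padded_limits([0, 0, 0]))
# ...while a varying dimension keeps its own range.
print(padded_limits([1, 4, 2]))
```

In the script, this could be used as `ax.set_xlim(*padded_limits(dimension))`.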

## Notes

1. writers.null simply swallows the output of the pipeline. We don’t need to write any data.

2. The `pdalargs` value is itself JSON stored inside a JSON string, so its inner double quotes need to be escaped with backslashes. Text that looks like a valid Python dictionary isn’t always valid JSON.
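The escaping in note 2 can be generated rather than typed by hand. The sketch below, which assumes you want to build the pipeline file from Python, encodes the arguments with `json.dumps` once for `pdalargs` and a second time for the whole pipeline. The variable names are illustrative only.

```python
import json

# Arguments intended for the script's `pdalargs` global.
plot_args = {"filename": "./exercises/python/histogram.png"}

pipeline = {
    "pipeline": [
        {"filename": "./exercises/python/athletic-fields.laz"},
        {
            "type": "filters.python",
            "function": "make_plot",
            "module": "anything",
            # First encoding: the dictionary becomes a JSON string.
            "pdalargs": json.dumps(plot_args),
            "script": "./exercises/python/histogram.py",
        },
        {"type": "writers.null"},
    ]
}

# Second encoding: serializing the pipeline escapes the inner quotes.
pipeline_text = json.dumps(pipeline, indent=4)
print(pipeline_text)
```

The printed `pdalargs` line contains `\"filename\"` with backslash-escaped quotes, matching the form used in the pipeline above.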