Hypsometric Relationship
Warning
This library is under development; none of the presented solutions are available for download yet.
Estimate the heights of trees that were not measured, based on the heights measured in the field.
Class Parameters
HypRel(x, y, df, model, iterator)
Parameters | Description |
---|---|
x | The name of the column that contains the tree diameters/circumferences. |
y | The name of the column that contains the tree heights. |
df | The DataFrame containing the tree data. |
model | (Optional) A list of models used for estimating tree heights. If omitted, all available models are used. Available models are: ['curtis', 'parabolic', 'stofel', 'henriksen', 'prodan_i', 'prodan_ii', 'smd_fm', 'ann']. |
iterator | (Optional) A column name string. Defines which column is used as an iterator, so that the models are fitted separately for each group. It could be a farm name, plot name, code, or any unique identification tag. |
Class Methods
HypRel.run()
HypRel.view_metrics()
HypRel.plots(dir = None, show = None)
HypRel.get_coef()
HypRel.predict()
- dir: The directory where you want to save your plots. If dir == None, the plots are displayed instead.
- show: Whether to display the plots on the screen. It can be True or False.
Methods | Description |
---|---|
.run() | Fit the models |
.view_metrics() | Return a table of metrics of each evaluated model |
.plots(dir=None, show=True) | Return the height and residuals plots |
.get_coef() | Return the coefficients for each model |
.predict() | Return the predicted heights and the models used, in new columns |
Example Usage
Using the data from Scolforo (2005), from a Pinus taeda stand ranging from 15 to 19 years old, with 5 plots of 420 m² measured, we can fit models to predict the missing heights.
First 5 rows of the file:
Parcela | Dap (cm) | H (m) | Idade (anos) |
---|---|---|---|
p-1 | 22.28 | 0.0 | 15 |
p-1 | 23.87 | 22.2 | 15 |
p-1 | 25.46 | 0.0 | 15 |
p-1 | 25.78 | 24.5 | 15 |
p-1 | 26.74 | 22.2 | 15 |
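A minimal sketch of how the sample rows above could be loaded into pandas. It assumes, as the outputs further down suggest, that a height of 0.0 marks a tree whose height was not measured; the variable names are illustrative and not part of the library.

```python
import pandas as pd

# First five rows of the example file, typed in by hand.
df = pd.DataFrame({
    "Parcela": ["p-1"] * 5,
    "Dap (cm)": [22.28, 23.87, 25.46, 25.78, 26.74],
    "H (m)": [0.0, 22.2, 0.0, 24.5, 22.2],
    "Idade (anos)": [15] * 5,
})

# Assumption: H == 0.0 means "height not measured in the field".
missing = df["H (m)"] == 0.0
n_missing = int(missing.sum())

# Only the measured trees can be used to fit a height-diameter model.
measured = df.loc[~missing, ["Dap (cm)", "H (m)"]]

print(f"{n_missing} of {len(df)} trees need an estimated height")
print(measured)
```

Keeping the unmeasured trees in the same table makes it easy to write the estimates back next to the diameters they were computed from.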
- Import the HypRel class.
- Import pandas for data manipulation.
Create a variable for the HypRel Class
hyp_rel_example.py
- Load your .xlsx file.
- Create the variable reg containing the HypRel class. Since model is not declared, all available models are used.
  If you want to use a specific model, set model=['curtis'], for example, and only the Curtis model is used.
  If you want to fit the models for each plot individually, use iterator="Parcela".
  Example: reg = HypRel('Dap (cm)', "H (m)", df, iterator="Parcela")
- Run the models and save them to the results variable.
- Evaluate the fitted models and save the metrics to the metrics variable.
- Generate the plots for the fitted models.
- Retrieve the coefficients of each fitted model.
- Retrieve the final heights and the models used for the estimation.
In this case, iterator and model were not declared, so all equations were fitted to the entire dataset.
These were the outputs:
Outputs
Tables
results
DataFrame containing the estimated heights for each model and the field-measured height in the "Real Height" column.
curtis | parabolic | stofel | henriksen | prodan_i | prodan_ii | smd_fm | ann | Real Height |
---|---|---|---|---|---|---|---|---|
21.11 | 22.19 | 22.33 | 21.78 | 22.81 | 22.77 | 21.12 | 22.82 | 0.00 |
22.21 | 22.97 | 23.07 | 22.71 | 23.39 | 23.35 | 22.26 | 23.45 | 22.20 |
23.22 | 23.73 | 23.79 | 23.58 | 23.98 | 23.95 | 23.29 | 24.08 | 0.00 |
23.42 | 23.88 | 23.93 | 23.75 | 24.10 | 24.08 | 23.49 | 24.20 | 24.50 |
23.98 | 24.33 | 24.35 | 24.24 | 24.46 | 24.44 | 24.06 | 24.58 | 22.20 |
metrics
DataFrame containing the metrics obtained for each model, assigning a score=10 to the best model.
Model | MAE | MAPE | MSE | RMSE | R squared | Explained Var | Mean Error | score |
---|---|---|---|---|---|---|---|---|
henriksen | 2.2125 | 7.6139 | 6.9901 | 2.6439 | 0.4163 | 0.4163 | 5.08E-15 | 10 |
curtis | 2.1993 | 7.5325 | 7.0147 | 2.6485 | 0.4142 | 0.4154 | 0.1182 | 9 |
smd_fm | 2.2099 | 7.6004 | 7.0020 | 2.6461 | 0.4153 | 0.4153 | -0.0015 | 8 |
stofel | 2.2060 | 7.5649 | 7.0210 | 2.6497 | 0.4137 | 0.4148 | 0.1177 | 7 |
parabolic | 2.2183 | 7.6358 | 7.0099 | 2.6476 | 0.4146 | 0.4146 | 5.47E-16 | 6 |
ann | 2.2194 | 7.6453 | 7.0312 | 2.6516 | 0.4128 | 0.4128 | 0.0002 | 5 |
prodan_ii | 2.2008 | 7.5168 | 7.0921 | 2.6631 | 0.4077 | 0.4127 | 0.2434 | 4 |
prodan_i | 2.2020 | 7.5241 | 7.0886 | 2.6625 | 0.4080 | 0.4125 | 0.2323 | 3 |
df_coefficients
DataFrame containing the coefficients of each model and indicating which one was selected as the best model. Since iterator was not declared, the column remains empty.
| iterator | model | equation | b0 | b1 | b2 | selected_model |
|---|---|---|---|---|---|---|
| | curtis | ln(h) = b0 + b1·(1/x) | 3.8139 | -17.0254 | | False |
| | parabolic | h = b0 + b1·x + b2·x² | 8.9680 | 0.6891 | -0.0043 | False |
| | stofel | ln(h) = b0 + b1·ln(x) | 1.6300 | 0.4755 | | False |
| | henriksen | h = b0 + b1·ln(x) | -20.1429 | 13.5058 | | True |
| | prodan_i | h = x² / (b0 + b1·x + b2·x²) | -7.7832 | 1.0345 | 0.0131 | False |
| | prodan_ii | h - 1.3 = x² / (b0 + b1·x + b2·x²) | -7.9302 | 1.1027 | 0.0131 | False |
| | smd_fm | y = log(y) ~ x = 1/x + 1/x² | 3.8008 | -16.0925 | | False |
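As a worked check, the coefficients in the table above can be plugged back into their equations. The sketch below does this by hand for four of the models at the first sample diameter (22.28 cm); the helper predict_height is hypothetical and not part of HypRel, and no bias correction is applied to the logarithmic models.

```python
import math

# Coefficients copied from the df_coefficients table above.
COEFS = {
    "curtis": (3.8139, -17.0254),
    "parabolic": (8.9680, 0.6891, -0.0043),
    "stofel": (1.6300, 0.4755),
    "henriksen": (-20.1429, 13.5058),
}


def predict_height(model: str, x: float) -> float:
    """Evaluate one fitted equation at diameter x (cm); returns height in m."""
    b = COEFS[model]
    if model == "curtis":       # ln(h) = b0 + b1·(1/x)
        return math.exp(b[0] + b[1] / x)
    if model == "parabolic":    # h = b0 + b1·x + b2·x²
        return b[0] + b[1] * x + b[2] * x ** 2
    if model == "stofel":       # ln(h) = b0 + b1·ln(x)
        return math.exp(b[0] + b[1] * math.log(x))
    if model == "henriksen":    # h = b0 + b1·ln(x)
        return b[0] + b[1] * math.log(x)
    raise ValueError(f"unknown model: {model}")


heights = {name: predict_height(name, 22.28) for name in COEFS}
for name, h in heights.items():
    print(f"{name:10s} {h:.2f}")
```

Up to the rounding of the printed coefficients, these values should agree with the first row of the results table (21.11, 22.19, 22.33 and 21.78 m).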
final_results
Initial DataFrame containing two new columns:
- best_predicted_height: the height estimated by the best model.
- selected_model: the model that was selected as the best.
Plot | DBH (cm) | H (m) | Age (years) | best_predicted_height | selected_model |
---|---|---|---|---|---|
p-1 | 22.28 | 0 | 15 | 21.77502772 | henriksen |
p-1 | 23.87 | 22.2 | 15 | 22.2 | henriksen |
p-1 | 25.46 | 0 | 15 | 23.57696518 | henriksen |
p-1 | 25.78 | 24.5 | 15 | 24.5 | henriksen |
p-1 | 26.74 | 22.2 | 15 | 22.2 | henriksen |
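The rows above suggest how the final column is assembled: measured heights are kept and only the zero entries are replaced by the prediction of the best model. A minimal sketch of that step, assuming 0 marks a missing height and using the Henriksen coefficients reported earlier; the DataFrame is rebuilt by hand and the library's internals may differ.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Dap (cm)": [22.28, 23.87, 25.46, 25.78, 26.74],
    "H (m)": [0.0, 22.2, 0.0, 24.5, 22.2],
})

# Henriksen: h = b0 + b1·ln(x), coefficients taken from the example output.
b0, b1 = -20.1429, 13.5058
predicted = b0 + b1 * np.log(df["Dap (cm)"])

# Keep the field measurement where it exists (H > 0), otherwise use the model.
df["best_predicted_height"] = df["H (m)"].where(df["H (m)"] > 0, predicted)
df["selected_model"] = "henriksen"

print(df)
```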
If you want each model to be fitted to each individual plot, simply replace the line:
reg = HypRel('Dap (cm)',"H (m)",df)
with:
reg = HypRel('Dap (cm)',"H (m)",df, iterator="Parcela")
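Conceptually, fitting per iterator value means repeating the regression once for each group. The sketch below illustrates the idea with pandas groupby and a single model (Henriksen); the data are synthetic, generated from known curves rather than taken from Scolforo (2005), and fit_henriksen is a hypothetical helper, not the library's implementation.

```python
import numpy as np
import pandas as pd


def fit_henriksen(group: pd.DataFrame) -> tuple:
    """Least-squares fit of h = b0 + b1·ln(x) on trees with a measured height."""
    measured = group[group["H (m)"] > 0]
    log_x = np.log(measured["Dap (cm)"].to_numpy())
    b1, b0 = np.polyfit(log_x, measured["H (m)"].to_numpy(), deg=1)
    return float(b0), float(b1)


# Synthetic data: two plots that follow different, known curves.
dap = np.array([15.0, 20.0, 25.0, 30.0])
df = pd.DataFrame({
    "Parcela": ["p-1"] * 4 + ["p-2"] * 4,
    "Dap (cm)": np.tile(dap, 2),
    "H (m)": np.concatenate([-20 + 13 * np.log(dap), -10 + 10 * np.log(dap)]),
})

# One set of coefficients per plot instead of one for the whole dataset.
coefs = {name: fit_henriksen(group) for name, group in df.groupby("Parcela")}
for name, (b0, b1) in coefs.items():
    print(f"{name}: b0={b0:.3f}, b1={b1:.3f}")
```

Per-plot fits can follow local conditions more closely, but each group needs enough measured trees for the regression to be meaningful.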
Plots
Since the line reg.plots(r'C:/Your/path/to_save') specifies a directory for saving the generated plots, two folders are created in that directory:
- A folder named heights containing the plots of the fitted curves.
- A folder named residuals containing the residual plots of the fits.


Available models
The smd_fm model is an adaptation of the "Forest Mensuration" Julia package by SILVA (2022). It performs regressions using different transformations of diameter at breast height (x) and height (y) when fitting hypsometric relationships.
Transformations of Y
- \( y \)
- \( \log(y) \)
- \( \log(y - 1.3) \)
- \( \log(1 + y) \)
- \( \frac{1}{y} \)
- \( \frac{1}{y - 1.3} \)
- \( \frac{1}{\sqrt{y}} \)
- \( \frac{1}{\sqrt{y - 1.3}} \)
- \( \frac{x}{\sqrt{y}} \)
- \( \frac{x}{\sqrt{y - 1.3}} \)
- \( \frac{x^2}{y} \)
- \( \frac{x^2}{y - 1.3} \)
Transformations of X
- \( x \)
- \( x^2 \)
- \( \log(x) \)
- \( \log(x)^2 \)
- \( \frac{1}{x} \)
- \( \frac{1}{x^2} \)
- \( x + x^2 \)
- \( x + \log(x) \)
- \( x + \log(x)^2 \)
- \( x + \frac{1}{x} \)
- \( x + \frac{1}{x^2} \)
- \( x^2 + \log(x) \)
- \( x^2 + \log(x)^2 \)
- \( x^2 + \frac{1}{x} \)
- \( \log(x) + \log(x)^2 \)
- \( \log(x) + \frac{1}{x} \)
- \( \log(x) + \frac{1}{x^2} \)
- \( \log(x)^2 + \frac{1}{x} \)
- \( \log(x)^2 + \frac{1}{x^2} \)
- \( \frac{1}{x} + \frac{1}{x^2} \)
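Each pair of one Y transformation and one X transformation defines a linear regression that ordinary least squares can fit. The sketch below shows the idea with a small subset of the listed transformations and synthetic data; the dictionaries and the fit helper are assumptions made for illustration, not the package's actual code.

```python
import numpy as np

# A few of the transformations listed above. Each X entry returns the
# regressor columns that go next to the intercept.
Y_TRANSFORMS = {
    "y": lambda y: y,
    "log(y)": np.log,
    "1/y": lambda y: 1.0 / y,
}
X_TRANSFORMS = {
    "x": lambda x: [x],
    "log(x)": lambda x: [np.log(x)],
    "1/x": lambda x: [1.0 / x],
    "1/x + 1/x^2": lambda x: [1.0 / x, 1.0 / x ** 2],
}


def fit(x, y, y_name, x_name):
    """OLS of the transformed y on an intercept plus the transformed x terms."""
    design = np.column_stack([np.ones_like(x)] + X_TRANSFORMS[x_name](x))
    target = Y_TRANSFORMS[y_name](y)
    coefs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coefs


# Synthetic data generated exactly from log(y) = 3.8 - 17/x.
x = np.array([10.0, 15.0, 20.0, 25.0, 30.0, 35.0])
y = np.exp(3.8 - 17.0 / x)

fits = {(yn, xn): fit(x, y, yn, xn) for yn in Y_TRANSFORMS for xn in X_TRANSFORMS}
print(len(fits), "candidate regressions")
print(fits[("log(y)", "1/x")])
```

With the full lists above (12 transformations of Y and 20 of X), the same loop would produce 240 candidate regressions.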
Explanation about ANN below.
Artificial Neural Network
When the 'ann' model is selected, four different artificial neural network structures are tested. Only one of them is returned: the one selected by the ranking function.
The 'ann' model is built with sklearn.neural_network.MLPRegressor.
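A rough sketch of what testing several network structures with MLPRegressor might look like. The four hidden-layer layouts, the solver, the synthetic data, and the use of training R² to choose among them are all assumptions made for illustration; the source does not state which structures or settings the library uses.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic diameter/height pairs following a logarithmic trend plus noise.
rng = np.random.default_rng(0)
x = np.linspace(10.0, 40.0, 60)
y = -20.0 + 13.0 * np.log(x) + rng.normal(0.0, 0.5, x.size)

# Neural networks train more reliably on standardized inputs.
X = ((x - x.mean()) / x.std()).reshape(-1, 1)

# Hypothetical candidate structures (hidden layer sizes).
STRUCTURES = [(4,), (8,), (4, 4), (8, 4)]

scores = {}
for hidden in STRUCTURES:
    net = MLPRegressor(hidden_layer_sizes=hidden, solver="lbfgs",
                       max_iter=2000, random_state=0)
    net.fit(X, y)
    scores[hidden] = net.score(X, y)  # R² on the training data

# Keep only the best-scoring structure, as a stand-in for the ranking step.
best = max(scores, key=scores.get)
print("best structure:", best)
```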
Ranking function
To select the best-performing models and rank them accordingly, the following metrics are obtained:
Metric name | Formula |
---|---|
Mean Absolute Error (MAE) | \( MAE = \frac{1}{n} \sum_{i=1}^{n} \|y_i - \hat{y}_i\| \) |
Mean Absolute Percentage Error (MAPE) | \( MAPE = \frac{100}{n} \sum_{i=1}^{n} \left\|\frac{y_i - \hat{y}_i}{y_i}\right\| \) |
Mean Squared Error (MSE) | \( MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \) |
Root Mean Squared Error (RMSE) | \( RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \) |
R Squared (Coefficient of Determination) | \( R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \) |
Explained Variance (EV) | \( EV = 1 - \frac{Var(y - \hat{y})}{Var(y)} \) |
Mean Error | \( Mean\ Error = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i) \) |
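The formulas above translate directly into code. The sketch below is a plain-Python reading of them, using population variances for Explained Var; the function name is hypothetical, and the library may compute these differently (for example, through scikit-learn). Because MAPE divides by the observed value, trees with a recorded height of 0 would have to be excluded first.

```python
import math


def metrics(y, y_hat):
    """Compute the seven metrics from observed (y) and predicted (y_hat) heights."""
    n = len(y)
    err = [a - b for a, b in zip(y, y_hat)]      # y_i - ŷ_i
    mean_err = sum(err) / n
    y_bar = sum(y) / n
    mse = sum(e * e for e in err) / n
    ss_tot = sum((a - y_bar) ** 2 for a in y)    # total sum of squares
    var_err = sum((e - mean_err) ** 2 for e in err) / n
    return {
        "MAE": sum(abs(e) for e in err) / n,
        "MAPE": 100 / n * sum(abs(e / a) for e, a in zip(err, y)),
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "R squared": 1 - (mse * n) / ss_tot,
        "Explained Var": 1 - var_err / (ss_tot / n),
        "Mean Error": mean_err,
    }


m = metrics([10.0, 20.0, 30.0], [12.0, 18.0, 33.0])
for name, value in m.items():
    print(f"{name:14s} {value:.4f}")
```

When the mean error is zero, Explained Var equals R squared, which matches the henriksen and parabolic rows of the example metrics table.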
After obtaining the metrics for each tested model, the best model receives a score of 10, while the others receive scores of 9, 8, and so on.
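The text does not state which metric, or combination of metrics, decides the order. In the example metrics table the order coincides with descending Explained Var, so the sketch below assumes that criterion purely for illustration.

```python
# Explained Var values copied from the example metrics table.
explained_var = {
    "henriksen": 0.4163, "curtis": 0.4154, "smd_fm": 0.4153, "stofel": 0.4148,
    "parabolic": 0.4146, "ann": 0.4128, "prodan_ii": 0.4127, "prodan_i": 0.4125,
}

# Assumed criterion: higher Explained Var ranks first.
ranked = sorted(explained_var, key=explained_var.get, reverse=True)

# The best model gets 10, the next 9, and so on.
scores = {model: 10 - position for position, model in enumerate(ranked)}

for model in ranked:
    print(f"{model:10s} {scores[model]}")
```

With eight models the scores run from 10 down to 3, as in the score column of the example table.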
References
CURTIS, R. O. (1967). Height-Diameter and Height-Diameter-Age Equations For Second-Growth Douglas-Fir. Forest Science, 13(4), 365–375. https://doi.org/10.1093/forestscience/13.4.365
SCOLFORO, J. R. S. (2005). Biometria Florestal: Parte I: Modelos de regressão linear e não-linear; Parte II: Modelos para relação hipsométrica, volume, afilamento e peso de matéria seca. Lavras: UFLA/FAEPE, pp. 224–226.
SILVA, M. D. (2022). Forest Mensuration.jl: Uma Introdução à Aplicações em Julia. 128 p. Trabalho de Conclusão de Curso (Graduação em Engenharia Florestal) – Universidade Federal de Santa Maria, Frederico Westphalen, RS, 2022.
JAMES, G.; WITTEN, D.; HASTIE, T.; TIBSHIRANI, R. (2013). An Introduction to Statistical Learning. In Springer Texts in Statistics. Springer New York. https://doi.org/10.1007/978-1-4614-7138-7