
Hypsometric Relationship


Warning

This library is under development; none of the presented solutions are available for download yet.

Estimate the heights of trees that were not measured, based on the heights measured in the field.


Class Parameters

HypRel(x, y, df, model, iterator)
Parameters Description
x The name of the column that contains the tree diameters/circumferences.
y The name of the column that contains the tree heights.
df The DataFrame containing the tree data.
model (Optional) A list of models used for estimating tree heights. If None, all available models are used. Available models are: ['curtis', 'parabolic', 'stofel', 'henriksen', 'prodan_i', 'prodan_ii', 'smd_fm', 'ann'].
iterator (Optional) A column name string. Defines which column is used as the iterator, so that the models are fitted separately for each unique value in that column. It can be a farm name, plot name, code, or any other unique identification tag.

Class Methods

functions and parameters
  HypRel.run()  
  HypRel.view_metrics()  
  HypRel.plots(dir = None, show = None)#(1)!
  HypRel.get_coef()
  HypRel.predict()

  1. dir: the directory where the plots will be saved. If dir is None, the plots are displayed.
    show: whether to display the plots on the screen. It can be True or False.
Methods Description
.run() Fit the models
.view_metrics() Return a table of metrics of each evaluated model
.plots(dir=None, show=True) Return the height and residuals plots
.get_coef() Return the coefficients for each model
.predict() Return the predicted heights and the models used, in new columns

Example Usage

Using the data from Scolforo (2005), from a Pinus taeda stand ranging from 15 to 19 years old, with 5 plots of 420 m² measured, we can fit models to predict the missing heights.

Download example file.

First 5 rows of the file:

Parcela Dap (cm) H (m) Idade (anos)
p-1 22.28 0.0 15
p-1 23.87 22.2 15
p-1 25.46 0.0 15
p-1 25.78 24.5 15
p-1 26.74 22.2 15
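
In this sample, a height of 0.0 appears to mark trees whose height was not measured. This is an inference from the final_results table in the Outputs section, where those rows receive an estimate, and is not stated explicitly. A minimal pandas sketch of how the measured and missing rows could be separated, using only the five rows above typed inline rather than the real file:

```python
import pandas as pd

# The five sample rows shown above, typed inline for illustration.
df = pd.DataFrame(
    {
        "Parcela": ["p-1"] * 5,
        "Dap (cm)": [22.28, 23.87, 25.46, 25.78, 26.74],
        "H (m)": [0.0, 22.2, 0.0, 24.5, 22.2],
        "Idade (anos)": [15] * 5,
    }
)

# Assumption: a height of 0.0 means "not measured in the field".
missing = df["H (m)"] == 0.0
measured_df = df[~missing]  # rows that can be used to fit a model
missing_df = df[missing]    # rows whose height has to be estimated

print(f"measured: {len(measured_df)}, missing: {len(missing_df)}")
```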

hyp_rel_example.py
from fptools.hyp_rel import HypRel#(1)!

import pandas as pd#(2)!

  1. Import HypRel class.
  2. Import pandas for data manipulation.

Create a variable for the HypRel Class

hyp_rel_example.py
df = pd.read_excel(r'C:/your/directory/exemplo_scolforo.xlsx')#(1)!

reg = HypRel('Dap (cm)',"H (m)",df)#(2)!

results = reg.run()#(3)!

metrics = reg.view_metrics()#(4)!

reg.plots(r'C:/Your/path/to_save')#(5)!

df_coefficients =  reg.get_coef()#(6)!

final_results =  reg.predict()#(7)!

  1. Load your .xlsx file.
  2. Create the variable reg containing the HypRel class. Since model is not declared, it will use all available models.
    If you want to use a specific model, set model=['curtis'] for example, and it will use only the Curtis model.
    If you want to fit the models for each plot individually, use iterator="Parcela".
    Example: reg = HypRel('Dap (cm)',"H (m)",df, iterator="Parcela")
  3. Run the models and save them to the results variable.
  4. Evaluate the fitted models and save the metrics to the metrics variable.
  5. Generate the plots for the fitted models.
  6. Retrieve the coefficients of each fitted model.
  7. Retrieve the final heights and the models used for the estimation.

In this case, iterator and model were not declared, so all equations were fitted to the entire dataset.
These were the outputs:

Outputs

Tables

results(1)

  1. DataFrame containing the estimated heights for each model and the field-measured height in the "Real Height" column.
curtis parabolic stofel henriksen prodan_i prodan_ii smd_fm ann Real Height
21.11 22.19 22.33 21.78 22.81 22.77 21.12 22.82 0.00
22.21 22.97 23.07 22.71 23.39 23.35 22.26 23.45 22.20
23.22 23.73 23.79 23.58 23.98 23.95 23.29 24.08 0.00
23.42 23.88 23.93 23.75 24.10 24.08 23.49 24.20 24.50
23.98 24.33 24.35 24.24 24.46 24.44 24.06 24.58 22.20

metrics(1)

  1. DataFrame containing the metrics obtained for each model, assigning a score=10 to the best model.
Model MAE MAPE MSE RMSE R squared Explained Var Mean Error score
henriksen 2.2125 7.6139 6.9901 2.6439 0.4163 0.4163 5.08E-15 10
curtis 2.1993 7.5325 7.0147 2.6485 0.4142 0.4154 0.1182 9
smd_fm 2.2099 7.6004 7.0020 2.6461 0.4153 0.4153 -0.0015 8
stofel 2.2060 7.5649 7.0210 2.6497 0.4137 0.4148 0.1177 7
parabolic 2.2183 7.6358 7.0099 2.6476 0.4146 0.4146 5.47E-16 6
ann 2.2194 7.6453 7.0312 2.6516 0.4128 0.4128 0.0002 5
prodan_ii 2.2008 7.5168 7.0921 2.6631 0.4077 0.4127 0.2434 4
prodan_i 2.2020 7.5241 7.0886 2.6625 0.4080 0.4125 0.2323 3

df_coefficients(1)

  1. DataFrame containing the coefficients of each model and indicating which one was selected as the best model.
    Since iterator was not declared, the column remains empty.
| iterator | model | equation | b0 | b1 | b2 | selected_model |
|---|---|---|---|---|---|---|
|  | curtis | ln(h) = b0 + b1·(1/x) | 3.8139 | -17.0254 |  | False |
|  | parabolic | h = b0 + b1·x + b2·x² | 8.9680 | 0.6891 | -0.0043 | False |
|  | stofel | ln(h) = b0 + b1·ln(x) | 1.6300 | 0.4755 |  | False |
|  | henriksen | h = b0 + b1·ln(x) | -20.1429 | 13.5058 |  | True |
|  | prodan_i | h = x² / (b0 + b1·x + b2·x²) | -7.7832 | 1.0345 | 0.0131 | False |
|  | prodan_ii | h - 1.3 = x² / (b0 + b1·x + b2·x²) | -7.9302 | 1.1027 | 0.0131 | False |
|  | smd_fm | y = log(y) ~ x = 1/x + 1/x² | 3.8008 | -16.0925 |  | False |
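
The coefficients can be checked by plugging them into the listed equations. The sketch below does this for the first tree (diameter 22.28 cm) with the six closed-form models. The helper name `predict_height` is hypothetical and not part of HypRel, and the coefficients are the rounded values from the table, so the results only match the first row of the results table to within a few hundredths of a metre.

```python
import math

# Coefficients copied from the df_coefficients table (rounded to 4 decimals).
COEFS = {
    "curtis": (3.8139, -17.0254),
    "parabolic": (8.9680, 0.6891, -0.0043),
    "stofel": (1.6300, 0.4755),
    "henriksen": (-20.1429, 13.5058),
    "prodan_i": (-7.7832, 1.0345, 0.0131),
    "prodan_ii": (-7.9302, 1.1027, 0.0131),
}


def predict_height(model: str, x: float) -> float:
    """Evaluate one fitted equation at diameter x (hypothetical helper)."""
    b = COEFS[model]
    if model == "curtis":      # ln(h) = b0 + b1 * (1/x)
        return math.exp(b[0] + b[1] / x)
    if model == "parabolic":   # h = b0 + b1*x + b2*x^2
        return b[0] + b[1] * x + b[2] * x**2
    if model == "stofel":      # ln(h) = b0 + b1 * ln(x)
        return math.exp(b[0] + b[1] * math.log(x))
    if model == "henriksen":   # h = b0 + b1 * ln(x)
        return b[0] + b[1] * math.log(x)
    if model == "prodan_i":    # h = x^2 / (b0 + b1*x + b2*x^2)
        return x**2 / (b[0] + b[1] * x + b[2] * x**2)
    if model == "prodan_ii":   # h - 1.3 = x^2 / (b0 + b1*x + b2*x^2)
        return x**2 / (b[0] + b[1] * x + b[2] * x**2) + 1.3
    raise ValueError(f"unknown model: {model}")


predictions = {name: predict_height(name, 22.28) for name in COEFS}
for name, h in predictions.items():
    print(f"{name:10s} {h:6.2f}")
```

The smd_fm and ann models are left out because their prediction steps are not given in closed form here.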

final_results(1)

  1. Initial DataFrame containing two new columns:
    best_predicted_height with the height estimated by the best model.
    selected_model indicating which was the best model.
Plot DBH (cm) H (m) Age (years) best_predicted_height selected_model
p-1 22.28 0 15 21.77502772 henriksen
p-1 23.87 22.2 15 22.2 henriksen
p-1 25.46 0 15 23.57696518 henriksen
p-1 25.78 24.5 15 24.5 henriksen
p-1 26.74 22.2 15 22.2 henriksen
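
The table suggests that measured heights are kept unchanged and only the rows without a measurement receive the model estimate. A minimal sketch of that behaviour, assuming 0 marks a missing height and using the rounded Henriksen coefficients from df_coefficients; this illustrates the pattern and is not the library's implementation:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Dap (cm)": [22.28, 23.87, 25.46, 25.78, 26.74],
        "H (m)": [0.0, 22.2, 0.0, 24.5, 22.2],
    }
)

# Henriksen coefficients from df_coefficients (rounded): h = b0 + b1 * ln(x)
b0, b1 = -20.1429, 13.5058
estimated = b0 + b1 * np.log(df["Dap (cm)"])

# Assumption: 0.0 marks a missing height; measured heights are kept as they are.
df["best_predicted_height"] = np.where(df["H (m)"] > 0, df["H (m)"], estimated)
df["selected_model"] = "henriksen"

print(df)
```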

If you want each model to be fitted to each individual plot, simply replace the line:
reg = HypRel('Dap (cm)',"H (m)",df)
with:
reg = HypRel('Dap (cm)',"H (m)",df, iterator="Parcela")

Plots

Since the line reg.plots(r'C:/Your/path/to_save') specified a directory for saving the generated plots, two folders will be created in this directory:
A folder named heights containing the plots of the fitted curves.
A folder named residuals containing the residual plots from the fittings.

Example of a plot generated with the fitted curve for the Henriksen model
fitted curve
Example of a residual plot generated for the Henriksen model
residuals

HypRel module workflow:
  • run: Runs all available models
  • view_metrics: Returns a DataFrame with the metrics of the fitted models
  • plots: Generates plots
  • coefficients: Returns a DataFrame with the coefficients of the fitted models
  • predict: Returns the original DataFrame with a new column containing the estimated heights

Available models

  • curtis
  • \[ \operatorname{Total height} = e^{(\beta_0+\beta_1*\frac{1}{x})} \]

  • parabolic
  • \[ \operatorname{Total height} = \beta_0 + \beta_1 * x + \beta_2 * x^2 \]

  • stofel
  • \[ \operatorname{Total height} = e^{(\beta_0+\beta_1*\ln(x))} \]

  • henriksen
  • \[ \operatorname{Total height} = \beta_0 + \beta_1 * \ln(x) \]

  • prodan_i
  • \[ \operatorname{Total height} = (\frac{x^2}{\beta_0+\beta_1*x+\beta_2* x^2}) \]

  • prodan_ii
  • \[ \operatorname{Total height} =(\frac{x^2}{\beta_0+\beta_1*x+\beta_2* x^2})+1.3 \]

  • smd_fm
  • Adaptation of the "Forest Mensuration" Julia package by SILVA (2022), which performs regressions using different transformations of diameter at breast height and height in hypsometric relationship processes.

    Transformations of Y

    • \( y \)
    • \( \log(y) \)
    • \( \log(y - 1.3) \)
    • \( \log(1 + y) \)
    • \( \frac{1}{y} \)
    • \( \frac{1}{y - 1.3} \)
    • \( \frac{1}{\sqrt{y}} \)
    • \( \frac{1}{\sqrt{y - 1.3}} \)
    • \( \frac{x}{\sqrt{y}} \)
    • \( \frac{x}{\sqrt{y - 1.3}} \)
    • \( \frac{x^2}{y} \)
    • \( \frac{x^2}{y - 1.3} \)

    Transformations of X

    • \( x \)
    • \( x^2 \)
    • \( \log(x) \)
    • \( \log(x)^2 \)
    • \( \frac{1}{x} \)
    • \( \frac{1}{x^2} \)
    • \( x + x^2 \)
    • \( x + \log(x) \)
    • \( x + \log(x)^2 \)
    • \( x + \frac{1}{x} \)
    • \( x + \frac{1}{x^2} \)
    • \( x^2 + \log(x) \)
    • \( x^2 + \log(x)^2 \)
    • \( x^2 + \frac{1}{x} \)
    • \( \log(x) + \log(x)^2 \)
    • \( \log(x) + \frac{1}{x} \)
    • \( \log(x) + \frac{1}{x^2} \)
    • \( \log(x)^2 + \frac{1}{x} \)
    • \( \log(x)^2 + \frac{1}{x^2} \)
    • \( \frac{1}{x} + \frac{1}{x^2} \)
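
The idea of trying several Y and X transformations can be sketched as a small grid of ordinary least squares fits. The example below uses only a subset of the transformations listed above and synthetic data. It selects the combination with the lowest RMSE on the original height scale, which is an assumption: the source does not say how smd_fm picks its combination. The name `fit_all` is hypothetical.

```python
import numpy as np

# Synthetic diameter/height pairs, for illustration only (not the Scolforo data).
rng = np.random.default_rng(0)
x = rng.uniform(15.0, 35.0, size=60)
y = np.exp(3.8 - 17.0 / x) + rng.normal(0.0, 0.5, size=60)

# A small subset of the Y transformations: (forward, inverse that recovers y).
Y_TRANSFORMS = {
    "y": (lambda v: v, lambda t: t),
    "log(y)": (np.log, np.exp),
    "1/y": (lambda v: 1.0 / v, lambda t: 1.0 / t),
}
# A small subset of the X transformations: each returns the regressor columns.
X_TRANSFORMS = {
    "x": lambda v: [v],
    "log(x)": lambda v: [np.log(v)],
    "1/x": lambda v: [1.0 / v],
    "1/x + 1/x^2": lambda v: [1.0 / v, 1.0 / v**2],
}


def fit_all(x, y):
    """Fit every Y ~ X combination by least squares, sorted by RMSE."""
    results = []
    for y_name, (forward, inverse) in Y_TRANSFORMS.items():
        for x_name, columns in X_TRANSFORMS.items():
            design = np.column_stack([np.ones_like(x), *columns(x)])
            coefs, *_ = np.linalg.lstsq(design, forward(y), rcond=None)
            y_hat = inverse(design @ coefs)  # back-transform to metres
            rmse = float(np.sqrt(np.mean((y - y_hat) ** 2)))
            results.append((rmse, y_name, x_name, coefs))
    return sorted(results, key=lambda r: r[0])


ranked = fit_all(x, y)
best_rmse, best_y, best_x, best_coefs = ranked[0]
print(f"best: {best_y} ~ {best_x}  RMSE={best_rmse:.3f}")
```

This sketch applies no bias correction when back-transforming from the logarithmic scale.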

  • ann
  • Explanation about ANN below.

    Artificial Neural Network

    When selecting the 'ann' model, 4 different structures of artificial neural networks will be tested. Only the result from 1 model will be returned. The model returned will be selected by the ranking function.
    For the 'ann' model, the module sklearn.neural_network.MLPRegressor is used.

    ANN parameters (shared by all MLPRegressor models):
    • Epochs: 3000
    • Activation: logistic
    • Solver mode: lbfgs
    • Batch size: dynamic
    • Learning rate init: 0.1
    • Learning rate mode: adaptive

    Hidden layer sizes tested:
    • Model_0: (4,5)
    • Model_1: (4,2)
    • Model_2: (3,2)
    • Model_3: (4,4)
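
The configuration above maps roughly onto scikit-learn as sketched below, on synthetic data. Two points are assumptions: "Epochs" is taken to mean `max_iter`, and the best network is chosen by lowest training RMSE as a stand-in for the ranking function. Note that with `solver="lbfgs"` scikit-learn does not use mini-batches and ignores both learning-rate settings, so they are passed only to mirror the listed parameters.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic diameter/height data, for illustration only.
rng = np.random.default_rng(42)
dbh = rng.uniform(15.0, 35.0, size=80)
height = -20.0 + 13.5 * np.log(dbh) + rng.normal(0.0, 1.0, size=80)
X = dbh.reshape(-1, 1)

# The four hidden-layer structures listed above.
HIDDEN_LAYERS = [(4, 5), (4, 2), (3, 2), (4, 4)]

fitted = []
for layers in HIDDEN_LAYERS:
    net = MLPRegressor(
        hidden_layer_sizes=layers,
        activation="logistic",
        solver="lbfgs",
        max_iter=3000,             # assumption: "Epochs: 3000"
        learning_rate="adaptive",  # only used by scikit-learn when solver="sgd"
        learning_rate_init=0.1,    # only used when solver is "sgd" or "adam"
        random_state=0,
    )
    net.fit(X, height)
    rmse = float(np.sqrt(np.mean((height - net.predict(X)) ** 2)))
    fitted.append((rmse, layers, net))

# Hypothetical stand-in for the ranking function: keep the lowest training RMSE.
best_rmse, best_layers, best_net = min(fitted, key=lambda item: item[0])
print(f"selected hidden layers: {best_layers}, RMSE={best_rmse:.3f}")
```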

    Ranking function

    To select the best-performing models and rank them accordingly, the following metrics are obtained:

    Metric name Formula
    Mean Absolute Error (MAE) \( MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i| \)
    Mean Absolute Percentage Error (MAPE) \( MAPE = \frac{100}{n} \sum_{i=1}^{n} \left|\frac{y_i - \hat{y}_i}{y_i}\right| \)
    Mean Squared Error (MSE) \( MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \)
    Root Mean Squared Error (RMSE) \( RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \)
    R Squared (Coefficient of Determination) \( R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \)
    Explained Variance (EV) \( EV = 1 - \frac{Var(y - \hat{y})}{Var(y)} \)
    Mean Error \( Mean\ Error = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i) \)
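
The formulas above translate directly into NumPy. The following is a minimal sketch with toy values; the function name `evaluate` is hypothetical, and the library's own implementation may differ.

```python
import numpy as np


def evaluate(y, y_hat):
    """Compute the listed metrics for measured heights y and predictions y_hat."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    residuals = y - y_hat
    mse = float(np.mean(residuals**2))
    return {
        "MAE": float(np.mean(np.abs(residuals))),
        # MAPE divides by y, so rows without a measured height must be excluded.
        "MAPE": float(100.0 * np.mean(np.abs(residuals / y))),
        "MSE": mse,
        "RMSE": float(np.sqrt(mse)),
        "R squared": float(1.0 - np.sum(residuals**2) / np.sum((y - y.mean()) ** 2)),
        "Explained Var": float(1.0 - np.var(residuals) / np.var(y)),
        "Mean Error": float(np.mean(residuals)),
    }


# Toy values, not taken from the example dataset.
metrics = evaluate([20.0, 22.0, 24.0], [21.0, 22.0, 23.0])
for name, value in metrics.items():
    print(f"{name:14s} {value:.4f}")
```

R squared and Explained Var coincide whenever the mean error is zero, which is consistent with the henriksen and parabolic rows of the metrics table in the Outputs section.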

    After obtaining the metrics for each tested model, the best model receives a score of 10, while the others receive scores of 9, 8, and so on.
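
How the seven metrics are combined into a single ranking is not described here. As a sketch under a stated assumption, ranking by explained variance alone (highest first) reproduces the scores shown in the metrics table of the Outputs section. This may be a coincidence of that example rather than the actual rule.

```python
# Explained variance values copied from the metrics table in the Outputs section.
explained_var = {
    "curtis": 0.4154,
    "parabolic": 0.4146,
    "stofel": 0.4148,
    "henriksen": 0.4163,
    "prodan_i": 0.4125,
    "prodan_ii": 0.4127,
    "smd_fm": 0.4153,
    "ann": 0.4128,
}

# Assumption: rank by this single metric, best (highest) first.
ordered = sorted(explained_var, key=explained_var.get, reverse=True)

# The best model gets 10, the next 9, and so on.
scores = {model: 10 - position for position, model in enumerate(ordered)}

for model in ordered:
    print(f"{model:10s} {scores[model]}")
```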

    References

    CURTIS, R. O. (1967). Height-Diameter and Height-Diameter-Age Equations For Second-Growth Douglas-Fir. Forest Science, 13(4), 365–375. https://doi.org/10.1093/forestscience/13.4.365

    SCOLFORO, J. R. S. (2005). Biometria Florestal: Parte I: Modelos de regressão linear e não-linear; Parte II: Modelos para relação hipsométrica, volume, afilamento e peso de matéria seca. Lavras: UFLA/FAEPE, pp. 224–226.

    SILVA, M. D. (2022). Forest Mensuration.jl: Uma Introdução à Aplicações em Julia. 128 p. Trabalho de Conclusão de Curso (Graduação em Engenharia Florestal) – Universidade Federal de Santa Maria, Frederico Westphalen, RS, 2022.

    JAMES, G.; WITTEN, D.; HASTIE, T.; TIBSHIRANI, R. (2013). An Introduction to Statistical Learning. In Springer Texts in Statistics. Springer New York. https://doi.org/10.1007/978-1-4614-7138-7