Hypsometric Relationship

Warning

This library is under development; none of the presented solutions are available for download yet.

Estimate missing tree heights from the heights measured in the field.

Class Parameters

HypRel(x, y, df, model, iterator)

Parameters	Description
x	The name of the column that contains the tree diameters/circumferences.
y	The name of the column that contains the tree heights.
df	The DataFrame containing the tree data.
model	(Optional) A list of models used for estimating tree heights. If None, all available models are used. Available models are: `['curtis', 'parabolic', 'stofel', 'henriksen', 'prodan_i', 'prodan_ii', 'smd_fm', 'ann']`.
iterator	(Optional) A column name string. Defines which column is used as an iterator, so that the models are fitted separately for each of its values. It could be a farm name, plot name, code, or any other unique identification tag.

Class Methods

functions and parameters

  HypRel.run()
  HypRel.view_metrics()
  HypRel.plots(dir = None, show = None)  # (1)
  HypRel.get_coef()
  HypRel.predict()

(1) dir: the directory where the plots are saved. If dir is None, the plots are displayed instead.
    show: whether to display the plots on the screen (True or False).

Methods	Description
.run()	Fit the models
.view_metrics()	Return a table of metrics of each evaluated model
.plots(dir=None, show=True)	Return the height and residuals plots
.get_coef()	Return the coefficients for each model
.predict()	Return the predicted heights and the models used in new columns

Example Usage

Using the data from Scolforo (2005), from a Pinus taeda stand ranging from 15 to 19 years old, with 5 plots of 420 m² measured, we can fit models to predict the missing heights.

Download example file.

First 5 rows of the file:

Parcela	Dap (cm)	H (m)	Idade (anos)
p-1	22.28	0.0	15
p-1	23.87	22.2	15
p-1	25.46	0.0	15
p-1	25.78	24.5	15
p-1	26.74	22.2	15

hyp_rel_example.py
from fptools.hyp_rel import HypRel  # (1)

import pandas as pd  # (2)

(1) Import the HypRel class.
(2) Import pandas for data manipulation.

Create an instance of the HypRel class

hyp_rel_example.py
df = pd.read_excel(r'C:/your/directory/exemplo_scolforo.xlsx')  # (1)

reg = HypRel('Dap (cm)',"H (m)",df)  # (2)

results = reg.run()  # (3)

metrics = reg.view_metrics()  # (4)

reg.plots(r'C:/Your/path/to_save')  # (5)

df_coefficients = reg.get_coef()  # (6)

final_results = reg.predict()  # (7)

(1) Load your .xlsx file.
(2) Create the variable reg containing a HypRel instance. Since model is not declared, all available models are used.
    To use a specific model, set model=['curtis'], for example, and only the Curtis model is fitted.
    To fit the models for each plot individually, use iterator="Parcela".
    Example: reg = HypRel('Dap (cm)',"H (m)",df, iterator="Parcela")
(3) Run the models and save them to the results variable.
(4) Evaluate the fitted models and save the metrics to the metrics variable.
(5) Generate the plots for the fitted models.
(6) Retrieve the coefficients of each fitted model.
(7) Retrieve the final heights and the models used for the estimation.

In this case, iterator and model were not declared, so all equations were fitted to the entire dataset.
These were the outputs:

Outputs

Tables

results

DataFrame containing the estimated heights for each model and the field-measured height in the "Real Height" column.

curtis	parabolic	stofel	henriksen	prodan_i	prodan_ii	smd_fm	ann	Real Height
21.11	22.19	22.33	21.78	22.81	22.77	21.12	22.82	0.00
22.21	22.97	23.07	22.71	23.39	23.35	22.26	23.45	22.20
23.22	23.73	23.79	23.58	23.98	23.95	23.29	24.08	0.00
23.42	23.88	23.93	23.75	24.10	24.08	23.49	24.20	24.50
23.98	24.33	24.35	24.24	24.46	24.44	24.06	24.58	22.20

metrics

DataFrame containing the metrics obtained for each model, assigning a score=10 to the best model.

Model	MAE	MAPE	MSE	RMSE	R squared	Explained Var	Mean Error	score
henriksen	2.2125	7.6139	6.9901	2.6439	0.4163	0.4163	5.08E-15	10
curtis	2.1993	7.5325	7.0147	2.6485	0.4142	0.4154	0.1182	9
smd_fm	2.2099	7.6004	7.0020	2.6461	0.4153	0.4153	-0.0015	8
stofel	2.2060	7.5649	7.0210	2.6497	0.4137	0.4148	0.1177	7
parabolic	2.2183	7.6358	7.0099	2.6476	0.4146	0.4146	5.47E-16	6
ann	2.2194	7.6453	7.0312	2.6516	0.4128	0.4128	0.0002	5
prodan_ii	2.2008	7.5168	7.0921	2.6631	0.4077	0.4127	0.2434	4
prodan_i	2.2020	7.5241	7.0886	2.6625	0.4080	0.4125	0.2323	3

df_coefficients

DataFrame containing the coefficients of each model and indicating which one was selected as the best model.
Since iterator was not declared, the column that identifies the iterator value remains empty.

model	equation	b0	b1	b2	selected_model
curtis	ln(h) = b0 + b1·(1/x)	3.8139	-17.0254		False
parabolic	h = b0 + b1·x + b2·x²	8.9680	0.6891	-0.0043	False
stofel	ln(h) = b0 + b1·ln(x)	1.6300	0.4755		False
henriksen	h = b0 + b1·ln(x)	-20.1429	13.5058		True
prodan_i	h = x² / (b0 + b1·x + b2·x²)	-7.7832	1.0345	0.0131	False
prodan_ii	h - 1.3 = x² / (b0 + b1·x + b2·x²)	-7.9302	1.1027	0.0131	False
smd_fm	y = log(y) ~ x = 1/x + 1/x²	3.8008	-16.0925		False
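The closed-form equations in this table can be checked by hand. The sketch below is not part of the library; `COEFFICIENTS` and `estimate_height` are hypothetical names, and the coefficients are the rounded values from the table. It evaluates six of the models for the first tree (diameter 22.28 cm). The smd_fm and ann models are left out because their full specification is not given here.

```python
import math

# Coefficients copied from the df_coefficients table (rounded to 4 decimals).
COEFFICIENTS = {
    "curtis": (3.8139, -17.0254),
    "parabolic": (8.9680, 0.6891, -0.0043),
    "stofel": (1.6300, 0.4755),
    "henriksen": (-20.1429, 13.5058),
    "prodan_i": (-7.7832, 1.0345, 0.0131),
    "prodan_ii": (-7.9302, 1.1027, 0.0131),
}


def estimate_height(model, x):
    """Evaluate one closed-form model at diameter x (cm); returns height (m)."""
    b = COEFFICIENTS[model]
    if model == "curtis":      # ln(h) = b0 + b1 * (1/x)
        return math.exp(b[0] + b[1] / x)
    if model == "parabolic":   # h = b0 + b1*x + b2*x^2
        return b[0] + b[1] * x + b[2] * x ** 2
    if model == "stofel":      # ln(h) = b0 + b1 * ln(x)
        return math.exp(b[0] + b[1] * math.log(x))
    if model == "henriksen":   # h = b0 + b1 * ln(x)
        return b[0] + b[1] * math.log(x)
    if model == "prodan_i":    # h = x^2 / (b0 + b1*x + b2*x^2)
        return x ** 2 / (b[0] + b[1] * x + b[2] * x ** 2)
    if model == "prodan_ii":   # h - 1.3 = x^2 / (b0 + b1*x + b2*x^2)
        return x ** 2 / (b[0] + b[1] * x + b[2] * x ** 2) + 1.3
    raise ValueError(f"unknown model: {model}")


first_tree = {name: estimate_height(name, 22.28) for name in COEFFICIENTS}
for name, height in first_tree.items():
    print(f"{name}: {height:.2f}")
```

Because the table coefficients are rounded, the values agree with the first row of the results table only to within a few hundredths of a metre.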

final_results

Initial DataFrame containing two new columns:
best_predicted_height with the height estimated by the best model.
selected_model indicating which was the best model.

Plot	DBH (cm)	H (m)	Age (years)	best_predicted_height	selected_model
p-1	22.28	0	15	21.77502772	henriksen
p-1	23.87	22.2	15	22.2	henriksen
p-1	25.46	0	15	23.57696518	henriksen
p-1	25.78	24.5	15	24.5	henriksen
p-1	26.74	22.2	15	22.2	henriksen
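The table suggests how the final column is assembled: measured heights are kept, and only rows whose height is 0 receive the estimate from the selected model. The following is a minimal sketch of that logic under two assumptions: that a height of 0 marks an unmeasured tree, and that the rounded Henriksen coefficients from the df_coefficients table are used. `henriksen` and `fill_missing_heights` are hypothetical helper names, not library functions.

```python
import math

# Henriksen coefficients from the df_coefficients table (rounded).
B0, B1 = -20.1429, 13.5058


def henriksen(dbh_cm):
    """h = b0 + b1 * ln(x)"""
    return B0 + B1 * math.log(dbh_cm)


def fill_missing_heights(rows):
    """Return copies of the rows with the two extra columns added."""
    filled = []
    for row in rows:
        measured = row["H (m)"]
        # Assumption: a height of 0 means the tree was not measured.
        height = measured if measured > 0 else henriksen(row["Dap (cm)"])
        filled.append({**row,
                       "best_predicted_height": height,
                       "selected_model": "henriksen"})
    return filled


rows = [
    {"Parcela": "p-1", "Dap (cm)": 22.28, "H (m)": 0.0},
    {"Parcela": "p-1", "Dap (cm)": 23.87, "H (m)": 22.2},
    {"Parcela": "p-1", "Dap (cm)": 25.46, "H (m)": 0.0},
]
final = fill_missing_heights(rows)
for row in final:
    print(row["Dap (cm)"], round(row["best_predicted_height"], 3))
```

With the rounded coefficients, the two filled values match the table to about three decimal places.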

If you want each model to be fitted to each individual plot, simply replace the line:
reg = HypRel('Dap (cm)',"H (m)",df)
with:
reg = HypRel('Dap (cm)',"H (m)",df, iterator="Parcela")
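To illustrate what fitting per iterator value means, the sketch below groups rows by the "Parcela" column and fits the Henriksen equation to each group with ordinary least squares. It is only an illustration of the idea, not the library's implementation: `fit_henriksen`, `fit_per_group` and `make_rows` are hypothetical helpers, the rows are synthetic and noise-free, and a height of 0 is assumed to mark an unmeasured tree.

```python
import math
from collections import defaultdict


def fit_henriksen(pairs):
    """Ordinary least squares for h = b0 + b1 * ln(x); pairs is [(x, h), ...]."""
    log_x = [math.log(x) for x, _ in pairs]
    heights = [h for _, h in pairs]
    n = len(pairs)
    mean_lx = sum(log_x) / n
    mean_h = sum(heights) / n
    sxx = sum((v - mean_lx) ** 2 for v in log_x)
    sxy = sum((v - mean_lx) * (h - mean_h) for v, h in zip(log_x, heights))
    b1 = sxy / sxx
    return mean_h - b1 * mean_lx, b1


def fit_per_group(rows, x, y, iterator):
    """Fit one Henriksen curve per distinct value of the iterator column."""
    groups = defaultdict(list)
    for row in rows:
        if row[y] > 0:  # assumption: 0 marks an unmeasured height
            groups[row[iterator]].append((row[x], row[y]))
    return {key: fit_henriksen(pairs) for key, pairs in groups.items()}


def make_rows(plot, b0, b1, diameters):
    """Synthetic rows lying exactly on a Henriksen curve (illustration only)."""
    return [{"Parcela": plot, "Dap (cm)": d, "H (m)": b0 + b1 * math.log(d)}
            for d in diameters]


rows = (make_rows("p-1", -20.0, 13.5, [20.0, 24.0, 28.0, 32.0])
        + make_rows("p-2", -15.0, 12.0, [18.0, 22.0, 26.0, 30.0]))
coefs = fit_per_group(rows, "Dap (cm)", "H (m)", "Parcela")
print(coefs)
```

Each plot recovers its own pair of coefficients, which is the behaviour one would expect when an iterator column is supplied.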

Plots

Since the line reg.plots(r'C:/Your/path/to_save') specified a directory for saving the generated plots, two folders will be created in this directory:
A folder named heights containing the plots of the fitted curves.
A folder named residuals containing the residual plots from the fittings.

Example of a plot generated with the fitted curve for the Henriksen model

Example of a residual plot generated for the Henriksen model

flowchart LR
    subgraph run
        runText1[Runs all available models]
    end
    subgraph view_metrics
        runText2[Returns a DataFrame with the metrics of the fitted models]
    end
    subgraph plots
        runText3[Generates plots]
    end
    subgraph coefficients
        runText4[Returns a DataFrame with the coefficients of the fitted models]
    end
    subgraph predict
        runText5[Returns the original DataFrame with a new column containing the estimated heights]
    end
    %% Links to subgraphs:
    HypRel-Module --> run
    HypRel-Module --> view_metrics
    HypRel-Module --> plots
    HypRel-Module --> coefficients
    HypRel-Module --> predict

Available models

curtis

\[ \operatorname{Total height} = e^{(\beta_0+\beta_1*\frac{1}{x})} \]

parabolic

\[ \operatorname{Total height} = \beta_0 + \beta_1 * x + \beta_2 * x^2 \]

stofel

\[ \operatorname{Total height} = e^{(\beta_0+\beta_1*\ln(x))} \]

henriksen

\[ \operatorname{Total height} = \beta_0 + \beta_1 * \ln(x) \]

prodan_i

\[ \operatorname{Total height} = (\frac{x^2}{\beta_0+\beta_1*x+\beta_2* x^2}) \]

prodan_ii

\[ \operatorname{Total height} =(\frac{x^2}{\beta_0+\beta_1*x+\beta_2* x^2})+1.3 \]

smd_fm

An adaptation of the "Forest Mensuration" Julia package by SILVA (2022). It fits regressions over different combinations of transformations of height (Y) and diameter at breast height (X) for hypsometric relationships.

Transformations of Y

\( y \)
\( \log(y) \)
\( \log(y - 1.3) \)
\( \log(1 + y) \)
\( \frac{1}{y} \)
\( \frac{1}{y - 1.3} \)
\( \frac{1}{\sqrt{y}} \)
\( \frac{1}{\sqrt{y - 1.3}} \)
\( \frac{x}{\sqrt{y}} \)
\( \frac{x}{\sqrt{y - 1.3}} \)
\( \frac{x^2}{y} \)
\( \frac{x^2}{y - 1.3} \)

Transformations of X

\( x \)
\( x^2 \)
\( \log(x) \)
\( \log(x)^2 \)
\( \frac{1}{x} \)
\( \frac{1}{x^2} \)
\( x + x^2 \)
\( x + \log(x) \)
\( x + \log(x)^2 \)
\( x + \frac{1}{x} \)
\( x + \frac{1}{x^2} \)
\( x^2 + \log(x) \)
\( x^2 + \log(x)^2 \)
\( x^2 + \frac{1}{x} \)
\( \log(x) + \log(x)^2 \)
\( \log(x) + \frac{1}{x} \)
\( \log(x) + \frac{1}{x^2} \)
\( \log(x)^2 + \frac{1}{x} \)
\( \log(x)^2 + \frac{1}{x^2} \)
\( \frac{1}{x} + \frac{1}{x^2} \)
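The idea of searching over transformation pairs can be sketched as follows. This is an illustration under stated assumptions, not the smd_fm implementation: only three Y and three single-term X transformations are tried, candidates are compared by RMSE on the original height scale (the source does not state the selection criterion), the data are synthetic and generated from a Curtis-type curve, and `ols` and `search` are hypothetical helpers.

```python
import math

# name -> (transform, inverse transform) for a subset of the Y transformations
Y_TRANSFORMS = {
    "y": (lambda y: y, lambda t: t),
    "log(y)": (math.log, math.exp),
    "1/y": (lambda y: 1 / y, lambda t: 1 / t),
}
# A subset of the single-term X transformations
X_TRANSFORMS = {
    "x": lambda x: x,
    "log(x)": math.log,
    "1/x": lambda x: 1 / x,
}


def ols(us, vs):
    """Simple linear regression v = b0 + b1 * u."""
    n = len(us)
    mean_u, mean_v = sum(us) / n, sum(vs) / n
    b1 = (sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
          / sum((u - mean_u) ** 2 for u in us))
    return mean_v - b1 * mean_u, b1


def search(xs, ys):
    """Fit every (Y, X) pair and sort by RMSE on the original height scale."""
    results = []
    for y_name, (forward, inverse) in Y_TRANSFORMS.items():
        for x_name, fx in X_TRANSFORMS.items():
            us = [fx(x) for x in xs]
            b0, b1 = ols(us, [forward(y) for y in ys])
            preds = [inverse(b0 + b1 * u) for u in us]
            rmse = math.sqrt(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
            results.append((rmse, y_name, x_name))
    return sorted(results)


# Synthetic, noise-free data that follow ln(h) = 3.8 - 17/x exactly.
xs = [15.0, 20.0, 25.0, 30.0, 35.0, 40.0]
ys = [math.exp(3.8 - 17.0 / x) for x in xs]
best = search(xs, ys)[0]
print(best[1], "~", best[2])
```

Because the data were generated from a Curtis-type curve, the pair log(y) with 1/x fits them exactly and comes out first. The two-term X transformations listed above would need a multiple regression instead of the single-predictor fit used here.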

ann

See the Artificial Neural Network section below.

Artificial Neural Network

When the 'ann' model is selected, four different artificial neural network structures are tested. Only one of them is returned: the one chosen by the ranking function.
The 'ann' model uses the sklearn.neural_network.MLPRegressor class.

---
title: ANN parameters
---
classDiagram
    class MLPRegressor {
        Epochs: 3000
        Activation: logistic
        Solver Mode: lbfgs
        Batch size: dynamic
        Learning rate init: 0.1
        Learning rate mode: adaptive
    }
    class Model_0 {
        Hidden layer sizes: (4,5)
    }
    class Model_1 {
        Hidden layer sizes: (4,2)
    }
    class Model_2 {
        Hidden layer sizes: (3,2)
    }
    class Model_3 {
        Hidden layer sizes: (4,4)
    }
    MLPRegressor <|-- Model_0
    MLPRegressor <|-- Model_1
    MLPRegressor <|-- Model_2
    MLPRegressor <|-- Model_3
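The four candidate networks could be described as keyword-argument dictionaries for MLPRegressor, as in the sketch below. The mapping from the diagram labels to scikit-learn parameter names is an assumption ("Epochs" to max_iter, "Batch size: dynamic" to batch_size="auto"), and `SHARED_PARAMS`, `HIDDEN_LAYERS` and `candidate_configs` are hypothetical names.

```python
# Shared settings from the diagram, expressed with MLPRegressor keyword names.
# Assumed mapping: "Epochs" -> max_iter, "Batch size: dynamic" -> batch_size="auto".
SHARED_PARAMS = {
    "max_iter": 3000,
    "activation": "logistic",
    "solver": "lbfgs",
    "batch_size": "auto",
    "learning_rate_init": 0.1,
    "learning_rate": "adaptive",
}

# One hidden-layer layout per candidate network.
HIDDEN_LAYERS = [(4, 5), (4, 2), (3, 2), (4, 4)]


def candidate_configs():
    """Build one keyword-argument dictionary per candidate network."""
    return [{**SHARED_PARAMS, "hidden_layer_sizes": layers} for layers in HIDDEN_LAYERS]


configs = candidate_configs()

# With scikit-learn installed, the networks could then be created with:
#   from sklearn.neural_network import MLPRegressor
#   networks = [MLPRegressor(**cfg) for cfg in configs]
```

According to the scikit-learn documentation, the learning-rate settings only take effect with the sgd or adam solvers and lbfgs does not use mini-batches, so with solver="lbfgs" those three values are accepted but should have no influence on the fit.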

Ranking function

To select the best-performing models and rank them accordingly, the following metrics are obtained:

Metric name	Formula
Mean Absolute Error (MAE)	\( MAE = \frac{1}{n} \sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert \)
Mean Absolute Percentage Error (MAPE)	\( MAPE = \frac{100}{n} \sum_{i=1}^{n} \left\lvert \frac{y_i - \hat{y}_i}{y_i} \right\rvert \)
Mean Squared Error (MSE)	\( MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \)
Root Mean Squared Error (RMSE)	\( RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \)
R Squared (Coefficient of Determination)	\( R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \)
Explained Variance (EV)	\( EV = 1 - \frac{Var(y - \hat{y})}{Var(y)} \)
Mean Error	\( Mean\ Error = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i) \)
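The formulas in this table translate directly into code. The sketch below computes all seven metrics in plain Python; `regression_metrics` is a hypothetical helper, and the three observed and predicted heights are the measured rows and Henriksen estimates from the results table shown in the example.

```python
import math


def regression_metrics(observed, predicted):
    """Compute the seven metrics from the table for paired height values."""
    n = len(observed)
    residuals = [y - p for y, p in zip(observed, predicted)]
    mean_y = sum(observed) / n
    mean_res = sum(residuals) / n
    mse = sum(r ** 2 for r in residuals) / n
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    var_y = ss_tot / n
    var_res = sum((r - mean_res) ** 2 for r in residuals) / n
    return {
        "MAE": sum(abs(r) for r in residuals) / n,
        "MAPE": 100 / n * sum(abs(r / y) for r, y in zip(residuals, observed)),
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "R squared": 1 - (mse * n) / ss_tot,
        "Explained Var": 1 - var_res / var_y,
        "Mean Error": mean_res,
    }


observed = [22.2, 24.5, 22.2]      # measured heights (m)
predicted = [22.71, 23.75, 24.24]  # Henriksen estimates for the same trees
metrics = regression_metrics(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Explained Var and R squared differ only by the squared Mean Error divided by the variance of the observed heights, so the two coincide when a model has no systematic bias. This is consistent with the example metrics table, where the models with a Mean Error close to zero show identical values in both columns.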

After obtaining the metrics for each tested model, the best model receives a score of 10, while the others receive scores of 9, 8, and so on.
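The source does not state which metric, or combination of metrics, decides the order. In the example metrics table the scores happen to follow descending Explained Var, so the sketch below assumes that single sort key purely for illustration. `EXPLAINED_VAR` and `assign_scores` are hypothetical names.

```python
# Explained Var values copied from the metrics table in the Outputs section.
EXPLAINED_VAR = {
    "curtis": 0.4154,
    "parabolic": 0.4146,
    "stofel": 0.4148,
    "henriksen": 0.4163,
    "prodan_i": 0.4125,
    "prodan_ii": 0.4127,
    "smd_fm": 0.4153,
    "ann": 0.4128,
}


def assign_scores(metric_by_model, top_score=10):
    """Give top_score to the model with the highest metric, then 9, 8, ..."""
    ranked = sorted(metric_by_model, key=metric_by_model.get, reverse=True)
    return {model: top_score - position for position, model in enumerate(ranked)}


scores = assign_scores(EXPLAINED_VAR)
for model, score in scores.items():
    print(model, score)
```

Under this assumption the scores reproduce the score column of the example table, with henriksen at 10 and prodan_i at 3.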

References

CURTIS, R. O. (1967). Height-Diameter and Height-Diameter-Age Equations For Second-Growth Douglas-Fir. Forest Science, 13(4), 365–375. https://doi.org/10.1093/forestscience/13.4.365

SCOLFORO, J. R. S. (2005). Biometria Florestal: Parte I: Modelos de regressão linear e não-linear; Parte II: Modelos para relação hipsométrica, volume, afilamento e peso de matéria seca. Lavras: UFLA/FAEPE, pp. 224–226.

SILVA, M. D. (2022). Forest Mensuration.jl: Uma Introdução à Aplicações em Julia. 128 p. Trabalho de Conclusão de Curso (Graduação em Engenharia Florestal) – Universidade Federal de Santa Maria, Frederico Westphalen, RS, 2022.

JAMES, G.; WITTEN, D.; HASTIE, T.; TIBSHIRANI, R. (2013). An Introduction to Statistical Learning. In Springer Texts in Statistics. Springer New York. https://doi.org/10.1007/978-1-4614-7138-7