| Title: | Extreme Value Modeling for r-Largest Order Statistics |
|---|---|
| Description: | Tools for extreme value modeling based on the r-largest order statistics framework. The package provides functions for parameter estimation via maximum likelihood, return level estimation with standard errors, profile likelihood-based confidence intervals, random sample generation, and entropy difference tests for selecting the number of order statistics r. Several r-largest order statistics models are implemented, including the four-parameter kappa (rK4D), generalized logistic (rGLO), generalized Gumbel (rGGD), logistic (rLD), and Gumbel (rGD) distributions. The rK4D methodology is described in Shin et al. (2022) <doi:10.1016/j.wace.2022.100533>, the rGLO model in Shin and Park (2024) <doi:10.1007/s00477-023-02642-7>, and the rGGD model in Shin and Park (2025) <doi:10.1038/s41598-024-83273-y>. The underlying distributions are related to the kappa distribution of Hosking (1994) <doi:10.1017/CBO9780511529443>, the generalized logistic distribution discussed by Ahmad et al. (1988) <doi:10.1016/0022-1694(88)90015-7>, and the generalized Gumbel distribution of Jeong et al. (2014) <doi:10.1007/s00477-014-0865-8>. Penalized likelihood approaches for extreme value estimation follow Martins and Stedinger (2000) <doi:10.1029/1999WR900330> and Coles and Dixon (1999) <doi:10.1023/A:1009905222644>. Selection of r is supported using methods discussed in Bader et al. (2017) <doi:10.1007/s11222-016-9697-3>. The package is intended for hydrological, climatological, and environmental extreme value analysis. |
| Authors: | Yire Shin [aut, cre] (ORCID: <https://orcid.org/0000-0003-1297-5430>), Jeong-Soo Park [aut, ths] (ORCID: <https://orcid.org/0000-0002-8460-4869>) |
| Maintainer: | Yire Shin <[email protected]> |
| License: | GPL-3 |
| Version: | 0.1.0 |
| Built: | 2026-05-29 10:42:16 UTC |
| Source: | https://github.com/yire-shin/evmr |
Annual top five daily rainfall events recorded in Bangkok, Thailand, from 1961 to 2018. The dataset contains the five largest daily rainfall amounts observed each year.
bangkok
A data frame with 58 rows and 5 columns:
Largest daily rainfall in the year (mm)
Second largest daily rainfall (mm)
Third largest daily rainfall (mm)
Fourth largest daily rainfall (mm)
Fifth largest daily rainfall (mm)
The data are commonly used for extreme value analysis based on r-largest order statistics.
Each row corresponds to one year from 1961 to 2018 and contains the five largest daily rainfall observations recorded in that year.
Rain gauge station records from Bangkok, Thailand.
Shin, Y. and Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics.
data(bangkok)
head(bangkok)
Annual r-largest stream flow observations from the Bevern River in the UK. The dataset contains the three largest daily stream flow values recorded in each year.
bevern
A data frame with 52 rows and 4 columns:
Year of observation
Largest daily stream flow in the year
Second largest daily stream flow
Third largest daily stream flow
This dataset is commonly used for extreme value analysis based on r-largest order statistics.
The data represent annual r-largest daily stream flow observations from the Bevern River. Each row corresponds to one year and contains the three largest daily stream flow measurements recorded in that year.
United Kingdom hydrological records. This is the original data source containing the daily stream flow observations.
Shin, Y. and Park, J.-S. (2024). Generalized logistic model for r-largest order statistics, with hydrological application.
data(bevern)
head(bevern)
Fit multiple extreme value models for r-largest order statistics and return a combined summary table including parameter estimates, standard errors, and return levels.
evmr(data, models = c("rk4d", "rglo", "rggd", "rgd", "rld"), num_inits = 100)
data |
A vector, matrix, or data frame containing r-largest order statistics. |
models |
Character vector specifying models to fit. |
num_inits |
Number of random initial values used in optimization. |
A data frame summarizing fitted models.
## Not run:
data(bangkok)
evmr(bangkok)
## End(Not run)
Annual r-largest daily stream flow observations from the Oykel River in the United Kingdom. The dataset contains the three largest daily stream flow values recorded in each year.
oykel
A data frame with 42 rows and 4 variables:
Year of observation
Largest daily stream flow in the year
Second largest daily stream flow
Third largest daily stream flow
The data are used for extreme value analysis based on r-largest order statistics models.
Each row represents one year and contains the three largest
daily stream flow observations recorded in that year.
Missing observations are represented by NA.
United Kingdom hydrological records. This is the original data source containing the daily stream flow data.
Shin, Y. and Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics, with hydrological application.
data(oykel)
head(oykel)
Computes the quantiles of the Gumbel distribution with location
parameter loc and scale parameter scale.
qgd(p, loc = 0, scale = 1)
p |
A numeric vector of probabilities between 0 and 1. |
loc |
A numeric value specifying the location parameter. |
scale |
A positive numeric value specifying the scale parameter. |
The quantile function of the Gumbel distribution is
$$Q(p) = \mu - \sigma \log(-\log p),$$
where $\mu$ is the location parameter and
$\sigma$ is the scale parameter.
A numeric vector of quantiles corresponding to p.
qgd(0.5, loc = 0, scale = 1)
qgd(c(0.1, 0.5, 0.9), loc = 0, scale = 1)
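The Gumbel quantile function, Q(p) = loc - scale * log(-log(p)), can be cross-checked in a few lines of base R. This is a minimal sketch rather than the package's implementation: `gumbel_quantile` and `gumbel_cdf` are hypothetical names, and the sketch assumes the standard Gumbel distribution function F(x) = exp(-exp(-(x - loc) / scale)).

```r
# Hypothetical helper (not part of evmr): Gumbel quantile function in base R.
gumbel_quantile <- function(p, loc = 0, scale = 1) {
  stopifnot(is.numeric(p), all(p > 0 & p < 1), scale > 0)
  loc - scale * log(-log(p))
}

# Standard Gumbel distribution function, used here to verify the inversion.
gumbel_cdf <- function(x, loc = 0, scale = 1) {
  exp(-exp(-(x - loc) / scale))
}

p <- c(0.1, 0.5, 0.9)
q <- gumbel_quantile(p, loc = 10, scale = 2)
gumbel_cdf(q, loc = 10, scale = 2)  # recovers p up to rounding error
```

Applying the distribution function to the computed quantiles should return the original probabilities, which is a quick sanity check on any quantile implementation.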
Computes the quantiles of the generalized Gumbel distribution
with location parameter loc, scale parameter scale,
and shape parameter shape.
qggd(p, loc = 0, scale = 1, shape = 0)
p |
A numeric vector of probabilities between 0 and 1. |
loc |
A numeric value specifying the location parameter. |
scale |
A positive numeric value specifying the scale parameter. |
shape |
A numeric value specifying the shape parameter. |
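The quantile function is computed as
$$Q(p) = \mu - \sigma \log\left(\frac{1 - p^{h}}{h}\right), \quad h \neq 0,$$
with the limiting case
$$Q(p) = \mu - \sigma \log(-\log p), \quad h = 0,$$
where $\mu$ is the location parameter, $\sigma$ is the
scale parameter, and $h$ is the shape parameter.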
A numeric vector of quantiles corresponding to p.
Jeong, B.-Y., Murshed, M. S., Seo, Y. A., and Park, J.-S. (2014). A three-parameter kappa distribution with hydrologic application: a generalized Gumbel distribution. Stochastic Environmental Research and Risk Assessment, 28(8), 2063–2074.
qggd(0.5, loc = 0, scale = 1, shape = 0.1)
qggd(c(0.1, 0.5, 0.9), loc = 0, scale = 1, shape = 0.1)
Computes the quantiles of the generalized logistic distribution
with location parameter loc, scale parameter scale,
and shape parameter shape.
qglo(p, loc = 0, scale = 1, shape = 0)
p |
A numeric vector of probabilities between 0 and 1. |
loc |
A numeric value specifying the location parameter. |
scale |
A positive numeric value specifying the scale parameter. |
shape |
A numeric value specifying the shape parameter. |
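The quantile function is computed as
$$Q(p) = \mu + \frac{\sigma}{k}\left[1 - \left(\frac{1 - p}{p}\right)^{k}\right], \quad k \neq 0,$$
with the limiting case
$$Q(p) = \mu - \sigma \log\left(\frac{1 - p}{p}\right), \quad k = 0,$$
where $\mu$ is the location parameter, $\sigma$ is the
scale parameter, and $k$ is the shape parameter.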
A numeric vector of quantiles corresponding to p.
Ahmad, M. I., Sinclair, C. D., and Werritty, A. (1988). Log-logistic flood frequency analysis. Journal of Hydrology. doi:10.1016/0022-1694(88)90015-7
qglo(0.5, loc = 0, scale = 1, shape = 0.1)
qglo(c(0.1, 0.5, 0.9), loc = 0, scale = 1, shape = 0.1)
Computes the quantiles of the four-parameter kappa distribution
with location parameter loc, scale parameter scale,
and shape parameters shape1 and shape2.
qk4d(p, loc = 0, scale = 1, shape1 = 0.1, shape2 = 0.1)
p |
A numeric vector of probabilities between 0 and 1. |
loc |
A numeric value specifying the location parameter. |
scale |
A positive numeric value specifying the scale parameter. |
shape1 |
A numeric value specifying the first shape parameter. |
shape2 |
A numeric value specifying the second shape parameter. |
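The quantile function of the four-parameter kappa distribution is
$$Q(p) = \mu + \frac{\sigma}{k}\left[1 - \left(\frac{1 - p^{h}}{h}\right)^{k}\right],$$
where $\mu$ is the location parameter, $\sigma$ is the
scale parameter, and $k$ and $h$ are shape parameters.
For numerical stability, the limiting cases $k \to 0$ and/or
$h \to 0$ are handled separately.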
A numeric vector of quantiles corresponding to p.
Shin, Y., and Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533
Hosking, J. R. M. (1994). The four-parameter kappa distribution. IBM Journal of Research and Development, 38(3), 251–258.
qk4d(0.5, loc = 0, scale = 1, shape1 = 0.1, shape2 = 0.1)
qk4d(c(0.1, 0.5, 0.9), loc = 0, scale = 1, shape1 = 0.1, shape2 = 0.1)
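The separate handling of the limiting cases can be illustrated with a short base R sketch. It assumes Hosking's (1994) parametrization of the four-parameter kappa quantile, Q(p) = loc + (scale / k) * (1 - ((1 - p^h) / h)^k); the function name `k4d_quantile`, the argument names `k` and `h`, and the tolerance `tol` are hypothetical and are not taken from the package, whose internal conventions may differ.

```r
# Hypothetical helper (not part of evmr): K4D quantile under Hosking's (1994)
# parametrization, with k and h the two shape parameters (assumed mapping).
k4d_quantile <- function(p, loc = 0, scale = 1, k = 0.1, h = 0.1, tol = 1e-8) {
  stopifnot(all(p > 0 & p < 1), scale > 0)
  # Inner term: (1 - p^h) / h tends to -log(p) as h -> 0.
  g <- if (abs(h) < tol) -log(p) else (1 - p^h) / h
  # Outer term: (1 - g^k) / k tends to -log(g) as k -> 0.
  if (abs(k) < tol) loc - scale * log(g) else loc + scale * (1 - g^k) / k
}

p <- c(0.1, 0.5, 0.9)
k4d_quantile(p, k = 0.1, h = 0.1)  # general case
k4d_quantile(p, k = 0, h = 0)      # Gumbel limit: -log(-log(p))
k4d_quantile(p, k = 0, h = -1)     # logistic limit: log(p / (1 - p))
```

Under this parametrization, setting k = 0 gives the generalized Gumbel quantile and setting h = -1 gives the generalized logistic quantile, so one function covers the special cases in the sketch.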
Computes the quantiles of the logistic distribution with location
parameter loc and scale parameter scale.
qld(p, loc = 0, scale = 1)
p |
A numeric vector of probabilities between 0 and 1. |
loc |
A numeric value specifying the location parameter. |
scale |
A positive numeric value specifying the scale parameter. |
The quantile function of the logistic distribution is
$$Q(p) = \mu + \sigma \log\left(\frac{p}{1 - p}\right),$$
where $\mu$ is the location parameter and
$\sigma$ is the scale parameter.
A numeric vector of quantiles corresponding to p.
qld(0.5, loc = 0, scale = 1)
qld(c(0.1, 0.5, 0.9), loc = 0, scale = 1)
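The logistic quantile function, Q(p) = loc + scale * log(p / (1 - p)), can be compared against base R's `stats::qlogis`. The helper name `logistic_quantile` below is hypothetical; the sketch only illustrates the formula and makes no claim about how `qld` is implemented.

```r
# Hypothetical helper (not part of evmr): logistic quantile function.
logistic_quantile <- function(p, loc = 0, scale = 1) {
  stopifnot(all(p > 0 & p < 1), scale > 0)
  loc + scale * log(p / (1 - p))
}

p <- c(0.1, 0.5, 0.9)
ours <- logistic_quantile(p, loc = 1, scale = 2)
base <- stats::qlogis(p, location = 1, scale = 2)
all.equal(ours, base)  # the two computations should agree
```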
Fits the Gumbel distribution to r-largest order statistics using
maximum likelihood estimation. Stationary and non-stationary models are
supported through generalized linear modelling of the location and scale
parameters.
rgd.fit(
  xdat, r = NULL, ydat = NULL, mul = NULL, sigl = NULL,
  mulink = identity, siglink = identity,
  num_inits = 100, muinit = NULL, siginit = NULL,
  show = TRUE, method = "Nelder-Mead", maxit = 10000, ...
)
xdat |
A numeric vector, matrix, or data frame of observations.
Each row should contain decreasing order statistics for a given year
or block. The first column therefore contains block maxima. Only the
first |
r |
The number of largest order statistics to use in the fitted model.
If |
ydat |
A matrix or data frame of covariates for non-stationary modelling
of the parameters, or |
mul, sigl
|
Integer vectors indicating which columns of |
mulink, siglink
|
Inverse link functions for the location and scale parameters, respectively. |
num_inits |
The number of initial parameter sets used in the optimization. |
muinit, siginit
|
Numeric vectors giving initial values for the location
and scale parameters. If |
show |
Logical. If |
method |
Optimization method passed to optim. |
maxit |
Maximum number of iterations for optim. |
... |
Additional control arguments passed to the optimizer. |
A list with components including:
trans |
Logical; |
model |
A list containing |
link |
A character string describing the inverse link functions. |
conv |
The convergence code returned by the optimizer. A value of 0
indicates successful convergence for |
nllh |
The negative log-likelihood evaluated at the fitted parameters. |
data |
The data used in the fit. |
mle |
The maximum likelihood estimates. |
cov |
The estimated covariance matrix. |
se |
The estimated standard errors. |
vals |
A matrix containing fitted values of the location and scale parameters at each observation. |
r |
The number of order statistics used in the fitted model. |
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Shin, Y., & Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y
x <- rgdr(n = 50, r = 2, loc = 10, scale = 2)
fit <- rgd.fit(x$rmat)
fit$r
fit$mle
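To make the fitted components (`mle`, `cov`, `se`, `nllh`) more concrete, the following base R sketch fits a stationary r-largest Gumbel model by direct minimization of the negative log-likelihood given in Coles (2001). It is an illustration under stated assumptions, not the code behind `rgd.fit`: the function `rgumbel_nll`, the single starting value, and the simulated data are all hypothetical, and the package's multi-start strategy (`num_inits`) is not reproduced.

```r
# Hypothetical sketch (not evmr code): r-largest Gumbel negative log-likelihood.
# x is an n-by-r matrix whose rows are decreasing order statistics.
rgumbel_nll <- function(par, x) {
  mu <- par[1]
  sigma <- par[2]
  # Guard: return Inf for invalid parameters so the optimizer can move on.
  if (!is.finite(mu) || !is.finite(sigma) || sigma <= 0) return(Inf)
  r <- ncol(x)
  z <- (x - mu) / sigma
  sum(exp(-z[, r])) + nrow(x) * r * log(sigma) + sum(z)
}

# Simulated data: cumulative products of uniforms give decreasing values,
# which the Gumbel quantile function maps to r-largest order statistics.
set.seed(123)
n <- 200
r <- 3
w <- t(apply(matrix(runif(n * r), n, r), 1, cumprod))
x <- 10 - 2 * log(-log(w))  # loc = 10, scale = 2

start <- c(mean(x[, 1]), sd(x[, 1]))
opt <- optim(start, rgumbel_nll, x = x, method = "Nelder-Mead",
             hessian = TRUE, control = list(maxit = 10000))

mle <- opt$par               # maximum likelihood estimates
cov <- solve(opt$hessian)    # inverse of the observed information matrix
se <- sqrt(diag(cov))        # standard errors
nllh <- opt$value            # negative log-likelihood at the optimum
```

Returning `Inf` for a non-positive scale keeps Nelder-Mead inside the valid parameter region without adding explicit constraints.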
Computes and plots the profile log-likelihood for a return level under
a stationary r-largest Gumbel distribution model fitted by rgd.fit().
rgd.prof(z, m, xlow, xup, conf = 0.95, nint = 100)
z |
An object returned by rgd.fit. |
m |
A return period greater than 1. The profile likelihood is computed
for the corresponding return level exceeded with probability 1/m. |
xlow, xup
|
The lower and upper bounds of the return level grid over which the profile likelihood is evaluated. |
conf |
A numeric vector of confidence levels for profile likelihood confidence intervals. |
nint |
The number of grid points used to evaluate the profile likelihood. |
The function evaluates the profile log-likelihood over a grid of return
level values and plots the resulting curve. Horizontal and vertical lines
are added to indicate profile likelihood confidence intervals for the
confidence levels specified in conf.
A data frame containing the return period, estimated return level, confidence level, lower confidence limit, upper confidence limit, and interval width. A profile likelihood plot is also produced.
## Not run:
x <- rgdr(n = 50, r = 2, loc = 10, scale = 2)
fit <- rgd.fit(x$rmat)
rgd.prof(fit, m = 100, xlow = 12, xup = 25)
## End(Not run)
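The step from a profile log-likelihood curve to a confidence interval can be sketched in base R. The sketch assumes the usual chi-squared calibration, in which the interval contains all grid values whose profile log-likelihood lies within `0.5 * qchisq(conf, 1)` of the maximum; `profile_ci` is a hypothetical helper, and the quadratic curve is toy input rather than output from `rgd.prof`.

```r
# Hypothetical helper (not part of evmr): read a confidence interval off a
# profile log-likelihood evaluated on a grid of return levels.
profile_ci <- function(grid, prof_loglik, conf = 0.95) {
  cutoff <- max(prof_loglik) - 0.5 * qchisq(conf, df = 1)
  inside <- grid[prof_loglik >= cutoff]
  c(lower = min(inside), upper = max(inside))
}

# Toy quadratic profile centred at 20, with curvature matching sd = 2.
grid <- seq(12, 28, length.out = 1601)
prof <- -0.5 * ((grid - 20) / 2)^2
ci <- profile_ci(grid, prof, conf = 0.95)
ci  # close to 20 -/+ 1.96 * 2 for this symmetric toy curve
```

The resolution of the interval is limited by the grid spacing, which is why a larger `nint` or a tighter `xlow`/`xup` range gives more precise limits. The sketch also assumes the curve is unimodal over the grid.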
Computes return levels and their standard errors for a stationary
Gumbel model fitted by rgd.fit.
rgd.rl(z, year = c(20, 50, 100, 200), show = FALSE)
z |
An object returned by rgd.fit. |
year |
A numeric vector of return periods for which return levels are to be computed. |
show |
Logical. If |
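For a return period $T$, the return level is defined as the quantile
exceeded with probability $1/T$. Under the Gumbel distribution, the
return level is
$$z_T = \mu - \sigma \log\{-\log(1 - 1/T)\},$$
where $\mu$ and $\sigma$ are the location and scale parameters.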
Standard errors are obtained using the delta method.
The input object z with two additional components:
rl |
A numeric vector of estimated return levels. |
rlse |
A numeric vector of standard errors of the estimated return levels. |
x <- rgdr(n = 50, r = 2, loc = 10, scale = 2)
fit <- rgd.fit(x$rmat)
out <- rgd.rl(fit, year = c(20, 50, 100))
out$rl
out$rlse
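The delta-method calculation can be sketched in base R. For the Gumbel return level mu - sigma * log(-log(1 - 1/T)), the gradient with respect to (mu, sigma) is (1, -log(-log(1 - 1/T))), and the variance is the quadratic form of that gradient with the parameter covariance matrix. The function `gumbel_return_level` and the numeric inputs are hypothetical and only illustrate the technique; they are not taken from `rgd.rl`.

```r
# Hypothetical helper (not part of evmr): Gumbel return levels with
# delta-method standard errors from estimates and their covariance matrix.
gumbel_return_level <- function(mle, cov, year = c(20, 50, 100, 200)) {
  yp <- -log(-log(1 - 1 / year))   # reduced variate for each return period
  rl <- mle[1] + mle[2] * yp       # return levels
  grad <- cbind(1, yp)             # gradient rows: d rl / d (mu, sigma)
  rlse <- sqrt(rowSums((grad %*% cov) * grad))
  list(rl = rl, rlse = rlse)
}

# Made-up estimates and covariance matrix, for illustration only.
mle <- c(10, 2)
cov <- matrix(c(0.04, 0.005, 0.005, 0.01), nrow = 2)
out <- gumbel_return_level(mle, cov, year = c(20, 50, 100))
out$rl
out$rlse
```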
Summarizes fitted Gumbel distribution models for r-largest order
statistics over a range of values of r. For each value of r,
the function fits the model using rgd.fit and computes
return levels using rgd.rl.
rgd.summary(
  data, r = NULL, ydat = NULL, mul = NULL, sigl = NULL,
  mulink = identity, siglink = identity,
  num_inits = 100, muinit = NULL, siginit = NULL,
  show = FALSE, method = "Nelder-Mead", maxit = 10000, ...
)
data |
A numeric vector, matrix, or data frame containing the r-largest order statistics. Each row should contain decreasing order statistics for one block or time period. |
r |
Optional integer giving the maximum number of order statistics
to summarize. If |
ydat |
A matrix or data frame of covariates for generalized linear
modelling of the parameters, or |
mul, sigl
|
Integer vectors indicating which columns of |
mulink, siglink
|
Inverse link functions for the location and scale parameters, respectively. |
num_inits |
Number of initial parameter sets used in optimization. |
muinit, siginit
|
Optional initial values for the location and scale parameters. |
show |
Logical. If |
method |
Optimization method passed to optim. |
maxit |
Maximum number of iterations for optimization. |
... |
Additional arguments passed to |
A data frame containing:
r: number of order statistics used
nllh: negative log-likelihood
mu, sigma: parameter estimates
mu.se, sigma.se: standard errors
rl20, rl50, rl100, rl200: return levels
rl20.se, rl50.se, rl100.se, rl200.se:
standard errors of return levels
x <- rgdr(n = 50, r = 3, loc = 10, scale = 2)
rgd.summary(x$rmat)
Generates random samples from the Gumbel distribution for
r-largest order statistics.
rgdr(n, r, loc = 0, scale = 1)
n |
A positive integer specifying the number of observations. |
r |
A positive integer specifying the number of order statistics for each observation. |
loc |
A numeric value specifying the location parameter. |
scale |
A positive numeric value specifying the scale parameter. |
The function first generates independent uniform random variables and then
constructs decreasing variables through cumulative products. These are
transformed using the Gumbel quantile function qgd.
A list with components:
umat |
An |
wmat |
An |
rmat |
An |
x <- rgdr(10, 3, loc = 0, scale = 1)
x$rmat
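The generation scheme described above (uniforms, cumulative products, then the Gumbel quantile function) can be sketched in base R as follows. The function `sim_rlargest_gumbel` is hypothetical; the component names `umat`, `wmat`, and `rmat` mirror the documented output, but their exact contents in `rgdr` are an assumption here.

```r
# Hypothetical sketch (not evmr code): simulate r-largest Gumbel order
# statistics via cumulative products of independent uniforms.
sim_rlargest_gumbel <- function(n, r, loc = 0, scale = 1) {
  stopifnot(n >= 1, r >= 1, scale > 0)
  umat <- matrix(runif(n * r), nrow = n, ncol = r)  # independent uniforms
  wmat <- umat
  if (r > 1) {
    # Row-wise cumulative products yield decreasing values in (0, 1).
    for (j in 2:r) wmat[, j] <- wmat[, j - 1] * umat[, j]
  }
  rmat <- loc - scale * log(-log(wmat))  # Gumbel quantile transform
  list(umat = umat, wmat = wmat, rmat = rmat)
}

set.seed(1)
x <- sim_rlargest_gumbel(5, 3, loc = 10, scale = 2)
x$rmat  # each row is decreasing from left to right
```

Because -log of a product of uniforms is a sum of unit exponentials, the transformed values behave like the successive points of the limiting Poisson process, which is why the construction produces decreasing order statistics within each row.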
Fits the generalized Gumbel distribution to r-largest order statistics
using maximum likelihood estimation. Stationary and non-stationary models
are supported through generalized linear modelling of the location, scale,
and shape parameters.
rggd.fit(
  xdat, r = NULL, ydat = NULL, mul = NULL, sigl = NULL, hl = NULL,
  mulink = identity, siglink = identity, hlink = identity,
  num_inits = 100, muinit = NULL, siginit = NULL, hinit = NULL,
  show = TRUE, method = "Nelder-Mead", maxit = 10000, ...
)
xdat |
A numeric vector, matrix, or data frame of observations.
Each row should contain decreasing order statistics for a given year
or block. The first column therefore contains block maxima. Only the
first |
r |
The number of largest order statistics to use in the fitted model.
If |
ydat |
A matrix or data frame of covariates for non-stationary modelling
of the parameters, or |
mul, sigl, hl
|
Integer vectors indicating which columns of |
mulink, siglink, hlink
|
Inverse link functions for the location, scale, and shape parameters, respectively. |
num_inits |
The number of initial parameter sets used in the optimization. |
muinit, siginit, hinit
|
Numeric vectors giving initial values for the
location, scale, and shape parameters. If |
show |
Logical. If |
method |
Optimization method passed to optim. |
maxit |
Maximum number of iterations for optim. |
... |
Additional control arguments passed to the optimizer. |
A list with components including:
trans |
Logical; |
model |
A list containing |
link |
A character vector describing the inverse link functions. |
conv |
The convergence code returned by the optimizer. |
nllh |
The negative log-likelihood evaluated at the fitted parameters. |
data |
The data used in the fit. |
mle |
The maximum likelihood estimates. |
cov |
The estimated covariance matrix when available. |
se |
The estimated standard errors when available. |
vals |
A matrix containing fitted values of the location, scale, and shape parameters at each observation. |
r |
The number of order statistics used in the fitted model. |
Shin, Y., & Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y
x <- rggdr(n = 50, r = 2, loc = 10, scale = 2, shape = 0.1)
fit <- rggd.fit(x$rmat)
fit$r
fit$mle
Computes and plots the profile log-likelihood for a return level under
a stationary r-largest generalized Gumbel distribution (rGGD) model
fitted by rggd.fit.
rggd.prof(z, m, xlow, xup, conf = 0.95, nint = 100)
z |
An object returned by rggd.fit. |
m |
A return period greater than 1. The profile likelihood is computed
for the corresponding return level exceeded with probability 1/m. |
xlow, xup
|
Lower and upper bounds of the return level grid over which the profile likelihood is evaluated. |
conf |
A numeric vector of confidence levels for profile likelihood confidence intervals. |
nint |
The number of grid points used to evaluate the profile likelihood. |
The function evaluates the profile log-likelihood over a grid of return
level values and plots the resulting curve. Horizontal and vertical lines
are added to indicate profile likelihood confidence intervals for the
confidence levels specified in conf.
Shin, Y., & Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y
A data frame containing the return period, estimated return level, confidence level, lower confidence limit, upper confidence limit, and interval width. A profile likelihood plot is also produced.
## Not run:
x <- rggdr(n = 50, r = 2, loc = 10, scale = 2, shape = 0.1)
fit <- rggd.fit(x$rmat)
rggd.prof(fit, m = 100, xlow = 12, xup = 25)
## End(Not run)
Computes return levels and their standard errors for a stationary
generalized Gumbel model fitted by rggd.fit.
rggd.rl(z, year = c(20, 50, 100, 200), show = FALSE)
z |
An object returned by rggd.fit. |
year |
A numeric vector of return periods for which return levels are to be computed. |
show |
Logical. If |
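For a return period $T$, the return level is defined as the quantile
exceeded with probability $1/T$. Under the generalized Gumbel
distribution, the return level is
$$z_T = \mu - \sigma \log\left(\frac{1 - (1 - 1/T)^{h}}{h}\right),$$
where $\mu$, $\sigma$, and $h$ are the location, scale, and shape parameters.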
Standard errors are obtained using the delta method.
The input object z with two additional components:
rl |
A numeric vector of estimated return levels. |
rlse |
A numeric vector of standard errors of the estimated return levels. |
x <- rggdr(n = 50, r = 2, loc = 10, scale = 2, shape = 0.1)
fit <- rggd.fit(x$rmat)
out <- rggd.rl(fit, year = c(20, 50, 100))
out$rl
out$rlse
Summarizes fitted generalized Gumbel distribution models for
r-largest order statistics over a range of values of r. For each value
of r, the function fits the model using rggd.fit
and computes return levels using rggd.rl.
rggd.summary(
  data, r = NULL, ydat = NULL, mul = NULL, sigl = NULL, hl = NULL,
  mulink = identity, siglink = identity, hlink = identity,
  num_inits = 100, muinit = NULL, siginit = NULL, hinit = NULL,
  show = FALSE, method = "Nelder-Mead", maxit = 10000, ...
)
data |
A numeric vector, matrix, or data frame containing the r-largest order statistics. Each row should contain decreasing order statistics for one block or time period. |
r |
Optional integer giving the maximum number of order statistics
to summarize. If |
ydat |
A matrix or data frame of covariates for generalized linear
modelling of the parameters, or |
mul, sigl, hl
|
Integer vectors indicating which columns of
|
mulink, siglink, hlink
|
Inverse link functions for the location, scale, and shape parameters, respectively. |
num_inits |
Number of initial parameter sets used in optimization. |
muinit, siginit, hinit
|
Optional initial values for the location, scale, and shape parameters. |
show |
Logical. If |
method |
Optimization method passed to optim. |
maxit |
Maximum number of iterations for optimization. |
... |
Additional arguments passed to |
A data frame containing:
r: number of order statistics used
nllh: negative log-likelihood
mu, sigma, h: parameter estimates
mu.se, sigma.se, h.se: standard errors
rl20, rl50, rl100, rl200: return levels
rl20.se, rl50.se, rl100.se, rl200.se:
standard errors of return levels
x <- rggdr(n = 50, r = 3, loc = 10, scale = 2, shape = 0.1)
rggd.summary(x$rmat)
Performs the entropy difference (ED) test for selecting the number of order statistics in the r-largest generalized Gumbel distribution (rGGD) model.
rggdEd(data)
data |
A numeric matrix or data frame containing the r-largest order statistics. Each row represents one block or observation, and columns must be ordered from largest to smallest. |
The test compares the entropy of models fitted with r - 1 and
r order statistics and evaluates whether the additional order
statistic provides significant information.
This function fits the rGGD model using rggd.fit and then
computes the entropy difference test statistic by comparing the fitted
likelihood contributions from models with r - 1 and r order
statistics.
A list containing:
statistics: the entropy difference test statistic
p.value: the two-sided p-value
theta: the estimated parameter vector of the rGGD model
ybar: the sample mean entropy difference
Shin, Y., & Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y
Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of r for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3
## Not run:
data(bangkok)
rggdEd(bangkok)
## End(Not run)
Performs the sequential entropy difference (ED) test for selecting the number of order statistics in the r-largest generalized Gumbel distribution (rGGD) model.
rggdEdtest(data)
data |
A numeric matrix or data frame containing the r-largest order statistics. Each row represents one observation (or block), and columns must be ordered from largest to smallest. |
The procedure computes ED tests sequentially for increasing values of r and
applies the ForwardStop and StrongStop stopping rules to control the
false discovery rate.
The function sequentially applies the entropy difference test
(rggdEd) for increasing values of r.
The columns of data must represent decreasing order statistics
within each row, with the first column containing the block maximum.
The resulting p-values are adjusted using the ForwardStop and StrongStop
procedures to help determine an appropriate value of r.
A data frame containing:
r: value of r tested
p.values: raw p-values from the entropy difference tests
statistic: test statistic for each value of r
est.loc: estimated location parameter
est.scale: estimated scale parameter
est.shape: estimated shape parameter
ybar: mean entropy difference
ForwardStop: adjusted values from the ForwardStop rule
StrongStop: adjusted values from the StrongStop rule
Shin, Y., & Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y
Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of r for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3
## Not run:
data(bangkok)
rggdEdtest(bangkok)
## End(Not run)
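The ForwardStop and StrongStop adjustments can be sketched in base R from a vector of sequential p-values. The sketch assumes the formulas given in Bader et al. (2017): ForwardStop uses the running mean of -log(1 - p), and StrongStop uses exp of the tail sum of log(p_j) / j, scaled by m / k for the k-th of m tests. The function `seq_stop_adjust`, its column name `step`, and the example p-values are hypothetical; whether `rggdEdtest` computes its columns exactly this way is not confirmed by this documentation.

```r
# Hypothetical helper (not part of evmr): ForwardStop and StrongStop
# adjusted values for an ordered sequence of p-values.
seq_stop_adjust <- function(p) {
  stopifnot(all(p > 0 & p < 1))
  m <- length(p)
  k <- seq_len(m)
  # ForwardStop: running average of -log(1 - p_i) over the first k tests.
  forward <- cumsum(-log(1 - p)) / k
  # StrongStop: exp(sum_{j >= k} log(p_j) / j), rescaled by m / k.
  strong <- rev(exp(cumsum(rev(log(p) / k)))) * m / k
  data.frame(step = k, p.values = p, ForwardStop = forward, StrongStop = strong)
}

# Made-up p-values, for illustration only.
adj <- seq_stop_adjust(c(0.5, 0.5))
adj
```

Under either rule, the adjusted values are compared with a chosen level, and the last step at which the adjusted value stays at or below that level determines where the sequence of tests stops.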
Computes the negative log-likelihood for the r-largest generalized Gumbel distribution (rGGD) model.
rggdLh(data, par)
data |
A numeric vector, matrix, or data frame of observations. If a vector is supplied, it is treated as a one-column matrix. If a matrix or data frame is supplied, each row is treated as one observation and columns represent decreasing order statistics. |
par |
A numeric vector of length 3 giving the location, scale, and shape parameters, respectively. |
This function is intended for internal likelihood evaluation in optimization.
Invalid parameter combinations return Inf rather than stopping with
an error, which makes the function more robust when used inside optimizers
such as optim.
Shin, Y., & Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y
A single numeric value giving the negative log-likelihood.
If the parameter combination is invalid, the function returns Inf.
Generates random samples from the generalized Gumbel distribution for
r-largest order statistics.
rggdr(n, r, loc = 0, scale = 1, shape = 0.1)
n |
A positive integer specifying the number of observations. |
r |
A positive integer specifying the number of order statistics for each observation. |
loc |
A numeric value specifying the location parameter. |
scale |
A positive numeric value specifying the scale parameter. |
shape |
A numeric value specifying the shape parameter. |
The function first generates independent uniform random variables and then
constructs decreasing variables through recursive transformations depending
on the shape parameter. These are transformed using the generalized Gumbel
quantile function qggd.
For valid generation, the shape parameter must satisfy
for , which implies
when .
A list with components:
umat |
An |
wmat |
An |
rmat |
An |
x <- rggdr(10, 3, loc = 0, scale = 1, shape = 0.1)
x$rmat
Fits the generalized logistic distribution to r-largest order
statistics using maximum likelihood estimation. Stationary and
non-stationary models are supported through generalized linear modelling
of the location, scale, and shape parameters.
rglo.fit(
  xdat, r = NULL, ydat = NULL, mul = NULL, sigl = NULL, shl = NULL,
  mulink = identity, siglink = identity, shlink = identity,
  num_inits = 100, muinit = NULL, siginit = NULL, shinit = NULL,
  show = TRUE, method = "Nelder-Mead", maxit = 10000, ...
)
xdat |
A numeric vector, matrix, or data frame of observations.
Each row should contain decreasing order statistics for a given year
or block. The first column therefore contains block maxima. Only the
first |
r |
The number of largest order statistics to use in the fitted model.
If |
ydat |
A matrix or data frame of covariates for non-stationary
modelling of the parameters, or |
mul, sigl, shl
|
Integer vectors indicating which columns of
|
mulink, siglink, shlink
|
Inverse link functions for the location, scale, and shape parameters, respectively. |
num_inits |
The number of initial parameter sets used in the optimization. |
muinit, siginit, shinit
|
Numeric vectors giving initial values for the
location, scale, and shape parameters. If |
show |
Logical. If |
method |
Optimization method passed to optim. |
maxit |
Maximum number of iterations for optim. |
... |
Additional control arguments passed to the optimizer. |
A list with components including:
trans |
Logical; |
model |
A list containing |
link |
A character vector describing the inverse link functions. |
conv |
The convergence code returned by the optimizer. |
nllh |
The negative log-likelihood evaluated at the fitted parameters. |
data |
The data used in the fit. |
mle |
The maximum likelihood estimates. |
cov |
The estimated covariance matrix when available. |
se |
The estimated standard errors when available. |
vals |
A matrix containing fitted values of the location, scale, and shape parameters at each observation. |
r |
The number of order statistics used in the fitted model. |
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Shin, Y., & Park, J-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7
x <- rglor(n = 50, r = 2, loc = 10, scale = 2, shape = 0.1)
fit <- rglo.fit(x$rmat, num_inits = 5)
fit$r
fit$mle
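The role of an inverse link function in a non-stationary fit can be sketched in base R. This is an illustration only: the covariate matrix `ydat`, the coefficient vector `beta_sig`, and the way the design matrix is assembled (an intercept column followed by the columns selected by `sigl`) are assumptions made for the example, not a description of the internals of rglo.fit.

```r
# Hypothetical covariate matrix: one row per block, a single scaled time trend.
ydat <- matrix(seq(0, 1, length.out = 5), ncol = 1)

# Assumed convention: sigl selects covariate columns and an intercept is prepended.
sigl <- 1
design <- cbind(1, ydat[, sigl, drop = FALSE])

# Hypothetical coefficients on the linear-predictor scale.
beta_sig <- c(0.5, 0.2)

# siglink is the inverse link: it maps the linear predictor to the parameter.
# Using exp keeps the scale positive for every block.
siglink <- exp
sigma_t <- siglink(drop(design %*% beta_sig))

sigma_t  # one scale value per block
```

With `siglink = identity` (the documented default) the same linear predictor would be used directly as the scale, so positivity is then not guaranteed by construction.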
Computes and plots the profile log-likelihood for a return level under
a stationary r-largest generalized logistic distribution (rGLO) model
fitted by rglo.fit.
rglo.prof(z, m, xlow, xup, conf = 0.95, nint = 100)
z |
An object returned by |
m |
A return period greater than 1. The profile likelihood is computed
for the corresponding return level exceeded with probability 1/m. |
xlow, xup
|
Lower and upper bounds of the return level grid over which the profile likelihood is evaluated. |
conf |
A numeric vector of confidence levels for profile likelihood confidence intervals. |
nint |
The number of grid points used to evaluate the profile likelihood. |
The function evaluates the profile log-likelihood over a grid of return
level values and plots the resulting curve. Horizontal and vertical lines
are added to indicate profile likelihood confidence intervals for the
confidence levels specified in conf.
A data frame containing the return period, estimated return level, confidence level, lower confidence limit, upper confidence limit, and interval width. A profile likelihood plot is also produced.
## Not run:
x <- rglor(n = 50, r = 2, loc = 10, scale = 2, shape = 0.1)
fit <- rglo.fit(x$rmat)
rglo.prof(fit, m = 100, xlow = 12, xup = 25)
## End(Not run)
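The horizontal and vertical lines on a profile likelihood plot are conventionally placed with a chi-square calibration (see Coles, 2001). The sketch below assumes that convention and uses a made-up profile curve: `grid`, `prof_ll`, and the quadratic shape are hypothetical stand-ins, not output of rglo.prof.

```r
# Hypothetical profile log-likelihood on a grid of candidate return levels.
grid <- seq(12, 25, length.out = 100)
prof_ll <- -120 - 0.5 * ((grid - 17) / 1.5)^2

conf <- 0.95

# Grid points whose profile log-likelihood lies within this drop of the
# maximum belong to the interval (one parameter of interest, so 1 df).
cutoff <- max(prof_ll) - 0.5 * qchisq(conf, df = 1)

inside <- grid[prof_ll >= cutoff]
ci <- range(inside)   # lower and upper confidence limits on the grid
ci
```

Because the limits are read off a grid, their resolution is bounded by the grid spacing, which is why a larger `nint` gives finer interval endpoints.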
Computes return levels and their standard errors for a stationary
generalized logistic model fitted by rglo.fit.
rglo.rl(z, year = c(20, 50, 100, 200), show = FALSE)
z |
An object returned by |
year |
A numeric vector of return periods for which return levels are to be computed. |
show |
Logical. If |
For a return period m, the return level is defined as the quantile
exceeded with probability 1/m. Under the generalized logistic
distribution, the return level is the 1 - 1/m quantile of the fitted
model, evaluated at the estimated location, scale, and shape parameters.
Standard errors are obtained using the delta method.
The input object z with two additional components:
rl |
A numeric vector of estimated return levels. |
rlse |
A numeric vector of standard errors of the estimated return levels. |
x <- rglor(n = 50, r = 2, loc = 10, scale = 2, shape = 0.1)
fit <- rglo.fit(x$rmat)
out <- rglo.rl(fit, year = c(20, 50, 100))
out$rl
out$rlse
Summarizes fitted generalized logistic distribution models for
r-largest order statistics over successive values of r, up to the maximum set by the r argument. For each value
of r, the function fits the model using rglo.fit
and computes return levels using rglo.rl.
rglo.summary(data, r = NULL, ydat = NULL, mul = NULL, sigl = NULL, shl = NULL,
             mulink = identity, siglink = identity, shlink = identity,
             num_inits = 100, muinit = NULL, siginit = NULL, shinit = NULL,
             show = FALSE, method = "Nelder-Mead", maxit = 10000, ...)
data |
A numeric vector, matrix, or data frame containing the r-largest order statistics. Each row should contain decreasing order statistics for one block or time period. |
r |
Optional integer giving the maximum number of order statistics
to summarize. If |
ydat |
A matrix or data frame of covariates for generalized linear
modelling of the parameters, or |
mul, sigl, shl
|
Integer vectors indicating which columns of
|
mulink, siglink, shlink
|
Inverse link functions for the location, scale, and shape parameters, respectively. |
num_inits |
Number of initial parameter sets used in optimization. |
muinit, siginit, shinit
|
Optional initial values for the location, scale, and shape parameters. |
show |
Logical. If |
method |
Optimization method passed to |
maxit |
Maximum number of iterations for optimization. |
... |
Additional arguments passed to |
A data frame containing:
r: number of order statistics used
nllh: negative log-likelihood
mu, sigma, xi: parameter estimates
mu.se, sigma.se, xi.se: standard errors
rl20, rl50, rl100, rl200: return levels
rl20.se, rl50.se, rl100.se, rl200.se:
standard errors of return levels
x <- rglor(n = 50, r = 3, loc = 10, scale = 2, shape = 0.1)
rglo.summary(x$rmat, num_inits = 5)
Performs the entropy difference (ED) test for selecting the number of order statistics in the r-largest generalized logistic distribution (rGLO) model.
rgloEd(data, par = NULL)
data |
A numeric matrix or data frame containing the r-largest order statistics. Each row represents one block or observation, and columns must be ordered from largest to smallest. |
par |
An optional numeric vector of length 3 giving the location,
scale, and shape parameters. If |
The test compares the entropy of models fitted with r - 1 and r
order statistics and evaluates whether the additional order
statistic provides significant information.
This function applies the entropy difference test to the r-largest
generalized logistic model. If par is not supplied, the model
parameters are first estimated using rglo.fit.
A list containing:
statistics: the entropy difference test statistic
p.value: the two-sided p-value
theta: the estimated or supplied parameter vector
ybar: the sample mean entropy difference
Bader, B., Yan, J., & Zhang, X. (2017).
Automated selection of r for the r-largest order statistics approach.
Statistics and Computing.
doi:10.1007/s11222-016-9697-3
Shin, Y., & Park, J-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7
## Not run:
data(bangkok)
rgloEd(bangkok)
## End(Not run)
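The general shape of an entropy difference statistic can be sketched in base R. The vectors `ll_r` and `ll_rm1` and the null expectation `eta` below are hypothetical numbers chosen for illustration. In practice `eta` depends on the fitted model and is not derived here, and the exact standardisation used by rgloEd is an assumption.

```r
# Hypothetical per-block log-likelihood contributions under the models
# using the top r and the top r - 1 order statistics.
ll_r   <- c(-3.1, -2.8, -3.4, -2.9, -3.0, -3.3)
ll_rm1 <- c(-2.0, -1.9, -2.2, -1.7, -2.1, -2.0)

# Assumed null expectation of the block-wise difference.
eta <- -1.0

d <- ll_r - ll_rm1                 # block-wise differences
ybar <- mean(d)                    # sample mean difference
stat <- sqrt(length(d)) * (ybar - eta) / sd(d)
p_value <- 2 * pnorm(-abs(stat))   # two-sided p-value from the normal limit

c(ybar = ybar, statistic = stat, p.value = p_value)
```

A small p-value would suggest that the r-th order statistic is not well described by the model fitted to the top r - 1, which argues for stopping at the smaller r.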
Performs the sequential entropy difference (ED) test for selecting the number of order statistics in the r-largest generalized logistic distribution (rGLO) model.
rgloEdtest(data, par = NULL)
data |
A numeric matrix or data frame containing the r-largest order statistics. Each row represents one observation (or block), and columns must be ordered from largest to smallest. |
par |
An optional numeric vector of length 3 giving the location,
scale, and shape parameters. If |
The procedure computes ED tests sequentially for successive values of r and
applies the ForwardStop and StrongStop stopping rules to control the
false discovery rate.
The function sequentially applies the entropy difference test
(rgloEd) for increasing values of r. The resulting
p-values are adjusted using the ForwardStop and StrongStop procedures
to help determine an appropriate value of r.
A data frame containing:
r: value of r tested
p.values: raw p-values from the entropy difference tests
statistic: test statistics for each value of r
est.loc: estimated location parameter
est.scale: estimated scale parameter
est.shape: estimated shape parameter
ybar: mean entropy difference
ForwardStop: adjusted values from the ForwardStop rule
StrongStop: adjusted values from the StrongStop rule
Bader, B., Yan, J., & Zhang, X. (2017).
Automated selection of r for the r-largest order statistics approach.
Statistics and Computing.
doi:10.1007/s11222-016-9697-3
Shin, Y., & Park, J-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7
## Not run:
data(bangkok)
rgloEdtest(bangkok)
## End(Not run)
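The two stopping rules can be sketched from a vector of ordered p-values. The forms below are the commonly stated versions of ForwardStop and StrongStop. The p-values are made up, and whether rgloEdtest scales its adjusted values in exactly this way is an assumption.

```r
# Hypothetical raw p-values from the sequential tests, in testing order.
p <- c(0.50, 0.50)
m <- length(p)
k <- seq_len(m)

# ForwardStop: running average of -log(1 - p) over the first k tests.
forward_stop <- cumsum(-log(1 - p)) / k

# StrongStop: for each k, combine p_k, ..., p_m with weights 1/j,
# then scale by m / k.
strong_stop <- vapply(
  k,
  function(i) exp(sum(log(p[i:m]) / (i:m))) * m / i,
  numeric(1)
)

data.frame(step = k, p.values = p,
           ForwardStop = forward_stop, StrongStop = strong_stop)
```

Under either rule, the adjusted values are compared with a chosen level, and the last step at which the adjusted value stays at or below that level determines where to stop.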
Computes the observation-wise log-likelihood contributions for the r-largest generalized logistic distribution (rGLO) model.
rgloLh(data, par)
data |
A numeric vector, matrix, or data frame of observations. If a vector is supplied, it is treated as a one-column matrix. If a matrix or data frame is supplied, each row is treated as one observation and columns represent decreasing order statistics. |
par |
A numeric vector of length 3 giving the location, scale, and shape parameters, respectively. |
This function is mainly intended for internal likelihood evaluation.
Invalid parameter combinations return Inf, which is often more
robust than stopping with an error when used inside iterative procedures.
A numeric vector of log-likelihood contributions, one for each row
of data. If the parameter combination is invalid, the function
returns Inf.
Generates random samples from the generalized logistic distribution for
r-largest order statistics.
rglor(n, r, loc = 0, scale = 1, shape = 0.1)
n |
A positive integer specifying the number of observations. |
r |
A positive integer specifying the number of order statistics for each observation. |
loc |
A numeric value specifying the location parameter. |
scale |
A positive numeric value specifying the scale parameter. |
shape |
A numeric value specifying the shape parameter. |
The function first generates independent uniform random variables and then
constructs decreasing variables through recursive transformations. These
are transformed using the generalized logistic quantile function
qglo.
A list with components:
umat |
An |
wmat |
An |
rmat |
An |
Shin, Y., & Park, J-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7
x <- rglor(10, 3, loc = 0, scale = 1, shape = 0.1)
x$rmat
Fits the four-parameter kappa distribution to r-largest order
statistics using maximum likelihood estimation. Stationary and
non-stationary models are supported through generalized linear modelling
of the location, scale, and two shape parameters.
rk4d.fit(xdat, r = NULL, penk = NULL, penh = NULL, ydat = NULL,
         mul = NULL, sigl = NULL, shl = NULL, hl = NULL,
         mulink = identity, siglink = identity, shlink = identity, hlink = identity,
         num_inits = 100, muinit = NULL, siginit = NULL, shinit = NULL, hinit = NULL,
         show = TRUE, method = "Nelder-Mead", maxit = 10000, ...)
xdat |
A numeric vector, matrix, or data frame of observations.
Each row should contain decreasing order statistics for a given year
or block. The first column therefore contains block maxima. Only the
first |
r |
The number of largest order statistics to use in the fitted model.
If |
penk |
Optional penalty for the first shape parameter. Supported values
include |
penh |
Optional penalty for the second shape parameter. Supported values
include |
ydat |
A matrix or data frame of covariates for non-stationary
modelling of the parameters, or |
mul, sigl, shl, hl
|
Integer vectors indicating which columns of
|
mulink, siglink, shlink, hlink
|
Inverse link functions for the location, scale, first shape, and second shape parameters, respectively. |
num_inits |
The number of initial parameter sets used in the optimization. |
muinit, siginit, shinit, hinit
|
Numeric vectors giving initial values for
the location, scale, first shape, and second shape parameters. If
|
show |
Logical. If |
method |
Optimization method passed to |
maxit |
Maximum number of iterations for |
... |
Additional control arguments passed to the optimizer. |
A list with components including:
trans |
Logical; |
model |
A list containing |
link |
A character vector describing the inverse link functions. |
conv |
The convergence code returned by the optimizer. |
nllh |
The negative log-likelihood evaluated at the fitted parameters. |
data |
The data used in the fit. |
mle |
The maximum likelihood estimates. |
cov |
The estimated covariance matrix when available. |
se |
The estimated standard errors when available. |
vals |
A matrix containing fitted values of the location, scale, first shape, and second shape parameters at each observation. |
r |
The number of order statistics used in the fitted model. |
Hosking, J. R. M. (1994). The four-parameter kappa distribution. IBM Journal of Research and Development, 38(3), 251–258.
Martins, E. S., & Stedinger, J. R. (2000). Generalized maximum-likelihood generalized extreme-value quantile estimators for hydrologic data. Water Resources Research, 36(3), 737–744. doi:10.1029/1999WR900330
Coles, S., & Dixon, M. (1999). Likelihood-based inference for extreme value models. Extremes, 2(1), 5–23. doi:10.1023/A:1009905222644
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Shin, Y., & Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533
x <- rk4dr(n = 50, r = 2, loc = 10, scale = 2, shape1 = 0.1, shape2 = 0.1)
fit <- rk4d.fit(x$rmat, num_inits = 5)
fit$r
fit$mle
Computes and plots the profile log-likelihood for a return level under
a stationary r-largest four-parameter kappa distribution (rK4D) model
fitted by rk4d.fit.
rk4d.prof(z, m, xlow, xup, conf = 0.95, nint = 100)
z |
An object returned by |
m |
A return period greater than 1. The profile likelihood is computed
for the corresponding return level exceeded with probability 1/m. |
xlow, xup
|
Lower and upper bounds of the return level grid over which the profile likelihood is evaluated. |
conf |
A numeric vector of confidence levels for profile likelihood confidence intervals. |
nint |
The number of grid points used to evaluate the profile likelihood. |
The function evaluates the profile log-likelihood over a grid of return
level values and plots the resulting curve. Horizontal and vertical lines
are added to indicate profile likelihood confidence intervals for the
confidence levels specified in conf.
A data frame containing the return period, estimated return level, confidence level, lower confidence limit, upper confidence limit, and interval width. A profile likelihood plot is also produced.
## Not run:
x <- rk4dr(n = 50, r = 2, loc = 10, scale = 2, shape1 = 0.1, shape2 = 0.1)
fit <- rk4d.fit(x$rmat)
rk4d.prof(fit, m = 100, xlow = 12, xup = 25)
## End(Not run)
Computes return levels and their standard errors for a stationary
four-parameter kappa model fitted by rk4d.fit.
rk4d.rl(z, year = c(20, 50, 100, 200), show = FALSE)
z |
An object returned by |
year |
A numeric vector of return periods for which return levels are to be computed. |
show |
Logical. If |
For a return period m, the return level is defined as the quantile
exceeded with probability 1/m. Under the four-parameter kappa
distribution, the return level is the 1 - 1/m quantile of the fitted model,
and standard errors are obtained using the delta method.
The input object z with two additional components:
rl: a numeric vector of estimated return levels
rlse: a numeric vector of standard errors of the estimated return levels
x <- rk4dr(n = 50, r = 2, loc = 10, scale = 2, shape1 = 0.1, shape2 = 0.1)
fit <- rk4d.fit(x$rmat, num_inits = 5)
out <- rk4d.rl(fit, year = c(20, 50, 100))
out$rl
out$rlse
Summarizes fitted four-parameter kappa distribution models for
r-largest order statistics over successive values of r, up to the maximum set by the r argument. For each value
of r, the function fits the model using rk4d.fit
and computes return levels using rk4d.rl.
rk4d.summary(data, r = NULL, penk = NULL, penh = NULL, ydat = NULL,
             mul = NULL, sigl = NULL, shl = NULL, hl = NULL,
             mulink = identity, siglink = identity, shlink = identity, hlink = identity,
             num_inits = 100, muinit = NULL, siginit = NULL, shinit = NULL, hinit = NULL,
             show = FALSE, method = "Nelder-Mead", maxit = 10000, ...)
data |
A numeric vector, matrix, or data frame containing the r-largest order statistics. Each row should contain decreasing order statistics for one block or time period. |
r |
Optional integer giving the maximum number of order statistics
to summarize. If |
penk |
Penalty function for the |
penh |
Penalty function for the |
ydat |
A matrix or data frame of covariates for generalized linear
modelling of the parameters, or |
mul, sigl, shl, hl
|
Integer vectors indicating which columns of
|
mulink, siglink, shlink, hlink
|
Inverse link functions for the location, scale, first shape, and second shape parameters, respectively. |
num_inits |
Number of initial parameter sets used in optimization. |
muinit, siginit, shinit, hinit
|
Optional initial values for the location, scale, first shape, and second shape parameters. |
show |
Logical. If |
method |
Optimization method passed to |
maxit |
Maximum number of iterations for optimization. |
... |
Additional arguments passed to |
A data frame containing:
r: number of order statistics used
nllh: negative log-likelihood
mu, sigma, xi, h: parameter estimates
mu.se, sigma.se, xi.se, h.se: standard errors
rl20, rl50, rl100, rl200: return levels
rl20.se, rl50.se, rl100.se, rl200.se:
standard errors of return levels
x <- rk4dr(n = 50, r = 3, loc = 10, scale = 2, shape1 = 0.1, shape2 = 0.1)
rk4d.summary(x$rmat, num_inits = 5)
rk4d.summary(x$rmat, penk = "CD", penh = "MS", num_inits = 5)
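The two penalty families named in the example can be sketched as plain R functions. These follow the forms usually attributed to Coles and Dixon (1999) and Martins and Stedinger (2000). The function names `cd_penalty` and `ms_penalty` are hypothetical, and how evmr maps "CD" and "MS" onto its own shape parameters and sign conventions is not stated here, so treat this as an illustration of the idea rather than the package's implementation.

```r
# Coles-Dixon style penalty on a shape parameter xi (heavy tail for xi > 0),
# with the commonly used choice alpha = lambda = 1.
cd_penalty <- function(xi, alpha = 1, lambda = 1) {
  out <- numeric(length(xi))          # stays 0 wherever xi >= 1
  out[xi <= 0] <- 1                   # no penalty for xi <= 0
  mid <- xi > 0 & xi < 1
  out[mid] <- exp(-lambda * (1 / (1 - xi[mid]) - 1)^alpha)
  out
}

# Martins-Stedinger style prior: a Beta(6, 9) density shifted to [-0.5, 0.5].
ms_penalty <- function(k, p = 6, q = 9) {
  dbeta(k + 0.5, shape1 = p, shape2 = q)
}

# A penalized fit would minimise nllh - log(penalty), so a penalty of 0
# rules a parameter value out and a penalty of 1 leaves the fit unchanged.
cd_penalty(c(-0.2, 0.5, 1.2))
ms_penalty(c(-0.1, 0.6))
```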
Performs the entropy difference (ED) test for selecting the number of order statistics in the r-largest four-parameter kappa distribution (rK4D) model.
rk4dEd(data)
data |
A numeric matrix or data frame containing the r-largest order statistics. Each row represents one block or observation, and columns must be ordered from largest to smallest. |
The test compares the entropy of models fitted with r - 1 and r
order statistics and evaluates whether the additional order
statistic provides significant information.
This function fits the rK4D model using rk4d.fit and then
computes the entropy difference test statistic by comparing the fitted
likelihood contributions from models with r - 1 and r order
statistics.
A list containing:
statistics: the entropy difference test statistic
p.value: the two-sided p-value
theta: the estimated parameter vector of the rK4D model
ybar: the sample mean entropy difference
Bader, B., Yan, J., & Zhang, X. (2017).
Automated selection of r for the r-largest order statistics approach.
Statistics and Computing.
doi:10.1007/s11222-016-9697-3
Shin, Y., & Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533
## Not run:
data(bangkok)
rk4dEd(bangkok)
## End(Not run)
Performs the sequential entropy difference (ED) test for selecting the number of order statistics in the r-largest four-parameter kappa distribution (rK4D) model.
rk4dEdtest(data)
data |
A numeric matrix or data frame containing the r-largest order statistics. Each row represents one observation (or block), and columns must be ordered from largest to smallest. |
The procedure computes ED tests sequentially for successive values of r and
applies the ForwardStop and StrongStop stopping rules to control the
false discovery rate.
The function sequentially applies the entropy difference test
(rk4dEd) for increasing values of r. The resulting
p-values are adjusted using the ForwardStop and StrongStop procedures
to help determine an appropriate value of r.
A data frame containing:
r: value of r tested
p.values: raw p-values from the entropy difference tests
statistic: test statistics for each value of r
est.loc: estimated location parameter
est.scale: estimated scale parameter
est.shape1: estimated first shape parameter
est.shape2: estimated second shape parameter
ybar: mean entropy difference
ForwardStop: adjusted values from the ForwardStop rule
StrongStop: adjusted values from the StrongStop rule
Bader, B., Yan, J., & Zhang, X. (2017).
Automated selection of r for the r-largest order statistics approach.
Statistics and Computing.
doi:10.1007/s11222-016-9697-3
Shin, Y., & Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533
## Not run:
data(bangkok)
rk4dEdtest(bangkok)
## End(Not run)
Computes the observation-wise log-likelihood contributions for the r-largest four-parameter kappa distribution (rK4D) model.
rk4dLh(data, par)
data |
A numeric vector, matrix, or data frame of observations. If a vector is supplied, it is treated as a one-column matrix. If a matrix or data frame is supplied, each row is treated as one observation and columns represent decreasing order statistics. |
par |
A numeric vector of length 4 giving the location, scale, first shape, and second shape parameters. |
A numeric vector of log-likelihood contributions for each row
of data. If invalid parameter combinations occur, the function
returns a large penalty value.
Generates random samples from the four-parameter kappa distribution for
r-largest order statistics.
rk4dr(n, r, loc = 0, scale = 1, shape1 = 0.1, shape2 = 0.1)
n |
A positive integer specifying the number of observations. |
r |
A positive integer specifying the number of order statistics for each observation. |
loc |
A numeric value specifying the location parameter. |
scale |
A positive numeric value specifying the scale parameter. |
shape1 |
A numeric value specifying the first shape parameter. |
shape2 |
A numeric value specifying the second shape parameter. |
The function first generates independent uniform random variables and then
constructs decreasing transformed variables recursively using the second
shape parameter. These are transformed by the four-parameter kappa quantile
function qk4d.
For valid generation with , the second shape parameter should
satisfy .
A list with components:
umat: an n x r matrix of independent uniform random numbers
wmat: an n x r matrix of transformed uniform variables
rmat: an n x r matrix of simulated r-largest order statistics
Shin, Y., & Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533
x <- rk4dr(10, 3, loc = 0, scale = 1, shape1 = 0.1, shape2 = 0.1)
x$rmat
Fits the logistic distribution to r-largest order statistics
using maximum likelihood estimation. Stationary and non-stationary models
are supported through generalized linear modelling of the location and
scale parameters.
rld.fit(xdat, r = NULL, ydat = NULL, mul = NULL, sigl = NULL,
        mulink = identity, siglink = identity,
        num_inits = 100, muinit = NULL, siginit = NULL,
        show = TRUE, method = "Nelder-Mead", maxit = 10000, ...)
xdat |
A numeric vector, matrix, or data frame of observations.
Each row should contain decreasing order statistics for a given year
or block. The first column therefore contains block maxima. Only the
first |
r |
The number of largest order statistics to use in the fitted model.
If |
ydat |
A matrix or data frame of covariates for non-stationary
modelling of the parameters, or |
mul, sigl
|
Integer vectors indicating which columns of
|
mulink, siglink
|
Inverse link functions for the location and scale parameters, respectively. |
num_inits |
The number of initial parameter sets used in the optimization. |
muinit, siginit
|
Numeric vectors giving initial values for the
location and scale parameters. If |
show |
Logical. If |
method |
Optimization method passed to |
maxit |
Maximum number of iterations for |
... |
Additional control arguments passed to the optimizer. |
A list with components including:
trans: logical; TRUE if a non-stationary model is fitted
model: a list containing mul and sigl
link: a character vector describing the inverse link functions
conv: the convergence code returned by the optimizer
nllh: the negative log-likelihood evaluated at the fitted parameters
data: the data used in the fit
mle: the maximum likelihood estimates
cov: the estimated covariance matrix when available
se: the estimated standard errors when available
vals: a matrix containing fitted values of the location and scale
r: the number of order statistics used in the fitted model
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Shin, Y., & Park, J-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7
x <- rldr(n = 50, r = 2, loc = 10, scale = 2)
fit <- rld.fit(x$rmat, num_inits = 5)
fit$r
fit$mle
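The idea behind `num_inits`, `method`, and `maxit` can be sketched with base R's `optim`. This is a minimal multi-start example on an ordinary logistic sample (the r = 1 analogue). The starting-value scheme and the objective `nll` are hypothetical and are not taken from rld.fit.

```r
set.seed(1)
x <- rlogis(200, location = 10, scale = 2)   # stand-in data (block maxima only)

# Negative log-likelihood; an invalid scale gets a large finite value so the
# optimizer keeps running instead of stopping with an error.
nll <- function(par) {
  if (par[2] <= 0) return(1e10)
  -sum(dlogis(x, location = par[1], scale = par[2], log = TRUE))
}

# Hypothetical scheme for drawing several starting points.
num_inits <- 5
starts <- cbind(mean(x) + rnorm(num_inits),
                sd(x) * runif(num_inits, 0.5, 1.5))

fits <- lapply(seq_len(num_inits), function(i) {
  optim(starts[i, ], nll, method = "Nelder-Mead",
        control = list(maxit = 10000))
})

# Keep the run with the smallest negative log-likelihood.
best <- fits[[which.min(vapply(fits, function(f) f$value, numeric(1)))]]
best$par
```

Running several starts and keeping the best one reduces the chance of reporting a local optimum, at the cost of proportionally more computation, which is why the examples on this page lower `num_inits` to 5.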
Computes and plots the profile log-likelihood for a return level under
a stationary r-largest logistic distribution (rLD) model fitted by
rld.fit.
rld.prof(z, m, xlow, xup, conf = 0.95, nint = 100)
z |
An object returned by |
m |
A return period greater than 1. The profile likelihood is computed
for the corresponding return level exceeded with probability 1/m. |
xlow, xup
|
Lower and upper bounds of the return level grid over which the profile likelihood is evaluated. |
conf |
A numeric vector of confidence levels for profile likelihood confidence intervals. |
nint |
The number of grid points used to evaluate the profile likelihood. |
The function evaluates the profile log-likelihood over a grid of return
level values and plots the resulting curve. Horizontal and vertical lines
are added to indicate profile likelihood confidence intervals for the
confidence levels specified in conf.
A data frame containing the return period, estimated return level, confidence level, lower confidence limit, upper confidence limit, and interval width. A profile likelihood plot is also produced.
## Not run:
x <- rldr(n = 50, r = 2, loc = 10, scale = 2)
fit <- rld.fit(x$rmat)
rld.prof(fit, m = 100, xlow = 12, xup = 25)
## End(Not run)
Computes return levels and their standard errors for a stationary
logistic model fitted by rld.fit.
rld.rl(z, year = c(20, 50, 100, 200), show = FALSE)
z |
An object returned by |
year |
A numeric vector of return periods for which return levels are to be computed. |
show |
Logical. If |
For a return period m, the return level is defined as the quantile
exceeded with probability 1/m. Under the logistic distribution,
the return level is the 1 - 1/m quantile of the fitted model,
and standard errors are obtained using the delta method.
The input object z with two additional components:
rl: a numeric vector of estimated return levels
rlse: a numeric vector of standard errors of the estimated return levels
x <- rldr(n = 50, r = 2, loc = 10, scale = 2)
fit <- rld.fit(x$rmat, num_inits = 5)
out <- rld.rl(fit, year = c(20, 50, 100))
out$rl
out$rlse
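The return level and delta-method calculation can be sketched in base R for the standard location-scale logistic distribution. The estimates `mu` and `sigma` and the covariance matrix `covmat` are made-up numbers, and base R's `qlogis` stands in for the package's quantile function. Whether rld.rl uses exactly this parameterization is an assumption.

```r
# Hypothetical fitted values and covariance matrix of (mu, sigma).
mu <- 10
sigma <- 2
covmat <- matrix(c(0.09, 0.01,
                   0.01, 0.04), nrow = 2, byrow = TRUE)

year <- c(20, 50, 100)

# Return level: the quantile with exceedance probability 1/m.
rl <- qlogis(1 - 1 / year, location = mu, scale = sigma)

# For the logistic distribution this equals mu + sigma * log(m - 1),
# so the gradient with respect to (mu, sigma) is (1, log(m - 1)).
grad <- cbind(1, log(year - 1))

# Delta method: se = sqrt(g' V g), evaluated row by row.
rlse <- sqrt(rowSums((grad %*% covmat) * grad))

cbind(year, rl, rlse)
```

The standard errors grow with the return period because the scale estimate is multiplied by log(m - 1), so uncertainty in the scale dominates for long return periods.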
Summarizes fitted logistic distribution models for r-largest order
statistics over successive values of r, up to the maximum set by the r argument. For each value of r,
the function fits the model using rld.fit and computes
return levels using rld.rl.
rld.summary(data, r = NULL, ydat = NULL, mul = NULL, sigl = NULL,
            mulink = identity, siglink = identity,
            num_inits = 100, muinit = NULL, siginit = NULL,
            show = FALSE, method = "Nelder-Mead", maxit = 10000, ...)
data |
A numeric vector, matrix, or data frame containing the r-largest order statistics. Each row should contain decreasing order statistics for one block or time period. |
r |
Optional integer giving the maximum number of order statistics
to summarize. If |
ydat |
A matrix or data frame of covariates for generalized linear
modelling of the parameters, or |
mul, sigl
|
Integer vectors indicating which columns of
|
mulink, siglink
|
Inverse link functions for the location and scale parameters, respectively. |
num_inits |
Number of initial parameter sets used in optimization. |
muinit, siginit
|
Optional initial values for the location and scale parameters. |
show |
Logical. If |
method |
Optimization method passed to |
maxit |
Maximum number of iterations for optimization. |
... |
Additional arguments passed to |
A data frame containing:
r: number of order statistics used
nllh: negative log-likelihood
mu, sigma: parameter estimates
mu.se, sigma.se: standard errors
rl20, rl50, rl100, rl200: return levels
rl20.se, rl50.se, rl100.se, rl200.se:
standard errors of return levels
x <- rldr(n = 50, r = 3, loc = 10, scale = 2)
rld.summary(x$rmat, num_inits = 5)
Generates random samples from the logistic distribution for
r-largest order statistics.
rldr(n, r, loc = 0, scale = 1)
n |
A positive integer specifying the number of observations. |
r |
A positive integer specifying the number of order statistics for each observation. |
loc |
A numeric value specifying the location parameter. |
scale |
A positive numeric value specifying the scale parameter. |
The function first generates independent uniform random variables and then
constructs decreasing transformed variables recursively. These are
transformed by the logistic quantile function qld.
A list with components:
umat: an n x r matrix of independent uniform random numbers
wmat: an n x r matrix of transformed uniform variables
rmat: an n x r matrix of simulated r-largest order statistics
x <- rldr(10, 3, loc = 0, scale = 1)
x$rmat
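The umat, wmat, rmat pipeline can be illustrated in base R. The recursion used here (running products of uniforms) is an assumption chosen only because it yields decreasing values within each row. It is not claimed to be the recursion used by rldr, and the resulting rows do not follow the rLD joint distribution. Base R's `qlogis` stands in for the package's `qld`.

```r
set.seed(42)
n <- 10
r <- 3

# Independent uniforms, one row per block.
umat <- matrix(runif(n * r), nrow = n, ncol = r)

# Assumed recursion for illustration: running products are strictly
# decreasing within each row and stay inside (0, 1).
wmat <- t(apply(umat, 1, cumprod))

# Map to the data scale with the logistic quantile function, which is
# increasing, so the row-wise ordering is preserved.
rmat <- qlogis(wmat, location = 0, scale = 1)

rmat
```

Each row of `rmat` is in decreasing order, matching the layout that the fitting functions on this page expect: block maximum in the first column, smaller order statistics to the right.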