Package 'evmr' reference manual

Title:	Extreme Value Modeling for r-Largest Order Statistics
Description:	Tools for extreme value modeling based on the r-largest order statistics framework. The package provides functions for parameter estimation via maximum likelihood, return level estimation with standard errors, profile likelihood-based confidence intervals, random sample generation, and entropy difference tests for selecting the number of order statistics r. Several r-largest order statistics models are implemented, including the four-parameter kappa (rK4D), generalized logistic (rGLO), generalized Gumbel (rGGD), logistic (rLD), and Gumbel (rGD) distributions. The rK4D methodology is described in Shin et al. (2022) <doi:10.1016/j.wace.2022.100533>, the rGLO model in Shin and Park (2024) <doi:10.1007/s00477-023-02642-7>, and the rGGD model in Shin and Park (2025) <doi:10.1038/s41598-024-83273-y>. The underlying distributions are related to the kappa distribution of Hosking (1994) <doi:10.1017/CBO9780511529443>, the generalized logistic distribution discussed by Ahmad et al. (1988) <doi:10.1016/0022-1694(88)90015-7>, and the generalized Gumbel distribution of Jeong et al. (2014) <doi:10.1007/s00477-014-0865-8>. Penalized likelihood approaches for extreme value estimation follow Martins and Stedinger (2000) <doi:10.1029/1999WR900330> and Coles and Dixon (1999) <doi:10.1023/A:1009905222644>. Selection of r is supported using methods discussed in Bader et al. (2017) <doi:10.1007/s11222-016-9697-3>. The package is intended for hydrological, climatological, and environmental extreme value analysis.
Authors:	Yire Shin [aut, cre] (ORCID: <https://orcid.org/0000-0003-1297-5430>), Jeong-Soo Park [aut, ths] (ORCID: <https://orcid.org/0000-0002-8460-4869>)
Maintainer:	Yire Shin <[email protected]>
License:	GPL-3
Version:	0.1.0
Built:	2026-05-29 10:42:16 UTC
Source:	https://github.com/yire-shin/evmr

Bangkok Rainfall Data

Description

Annual top five daily rainfall events recorded in Bangkok, Thailand, from 1961 to 2018. The dataset contains the five largest daily rainfall amounts observed each year.

Usage

bangkok

Format

A data frame with 58 rows and 5 columns:

X1: Largest daily rainfall in the year (mm)
X2: Second largest daily rainfall (mm)
X3: Third largest daily rainfall (mm)
X4: Fourth largest daily rainfall (mm)
X5: Fifth largest daily rainfall (mm)

Details

The data are commonly used for extreme value analysis based on r-largest order statistics.

Each row corresponds to one year from 1961 to 2018 and contains the five largest daily rainfall observations recorded in that year.

Source

Rain gauge station records from Bangkok, Thailand.

References

Shin, Y. and Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533

Examples

data(bangkok)
head(bangkok)

Bevern Stream Flow Data

Description

Annual r-largest stream flow observations from the Bevern River in the UK. The dataset contains the three largest daily stream flow values recorded in each year.

Usage

bevern

Format

A data frame with 52 rows and 4 columns:

Year: Year of observation
r1: Largest daily stream flow in the year
r2: Second largest daily stream flow
r3: Third largest daily stream flow

Details

This dataset is commonly used for extreme value analysis based on r-largest order statistics.

The data represent annual r-largest daily stream flow observations from the Bevern River. Each row corresponds to one year and contains the three largest daily stream flow measurements recorded in that year.

Source

United Kingdom hydrological records. This is the original data source containing the daily stream flow observations.

References

Shin, Y. and Park, J.-S. (2024). Generalized logistic model for r-largest order statistics, with hydrological application.

Examples

data(bevern)
head(bevern)

Fit and Compare r-Largest Order Statistics Models

Description

Fit multiple extreme value models for r-largest order statistics and return a combined summary table including parameter estimates, standard errors, and return levels.

Usage

evmr(data, models = c("rk4d", "rglo", "rggd", "rgd", "rld"), num_inits = 100)

Arguments

data

A vector, matrix, or data frame containing r-largest order statistics.

models

Character vector specifying models to fit.

num_inits

Number of random initial values used in optimization.

Value

A data frame summarizing fitted models.

Examples

## Not run: 
data(bangkok)
evmr(bangkok)

## End(Not run)

Oykel River Stream Flow Data

Description

Annual r-largest daily stream flow observations from the Oykel River in the United Kingdom. The dataset contains the three largest daily stream flow values recorded in each year.

Usage

oykel

Format

A data frame with 42 rows and 4 columns:

Year: Year of observation
r1: Largest daily stream flow in the year
r2: Second largest daily stream flow
r3: Third largest daily stream flow

Details

The data are used for extreme value analysis based on r-largest order statistics models.

Each row represents one year and contains the three largest daily stream flow observations recorded in that year. Missing observations are represented by NA.

Source

United Kingdom hydrological records. This is the original data source containing the daily stream flow data.

References

Shin, Y. and Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y

Examples

data(oykel)
head(oykel)

Quantile Function of the Gumbel Distribution

Description

Computes the quantiles of the Gumbel distribution with location parameter loc and scale parameter scale.

Usage

qgd(p, loc = 0, scale = 1)

Arguments

p

A numeric vector of probabilities in $(0,1)$.

loc

A numeric value specifying the location parameter.

scale

A positive numeric value specifying the scale parameter.

Details

The quantile function of the Gumbel distribution is

$Q(p) = \mu - \sigma \log(-\log(p)),$

where $\mu$ is the location parameter and $\sigma > 0$ is the scale parameter.
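
As a minimal base-R sketch of the formula above, the quantile can be computed directly and checked against the Gumbel distribution function $F(x) = \exp[-\exp\{-(x-\mu)/\sigma\}]$. The helper names gumbel_quantile and gumbel_cdf are hypothetical and are not the package's own implementation of qgd.

```r
# Hypothetical helper illustrating Q(p) = mu - sigma * log(-log(p)).
gumbel_quantile <- function(p, loc = 0, scale = 1) {
  stopifnot(all(p > 0 & p < 1), scale > 0)
  loc - scale * log(-log(p))
}

# Gumbel distribution function, used only to verify F(Q(p)) = p.
gumbel_cdf <- function(x, loc = 0, scale = 1) {
  exp(-exp(-(x - loc) / scale))
}

p <- c(0.1, 0.5, 0.9)
q <- gumbel_quantile(p, loc = 10, scale = 2)
all.equal(gumbel_cdf(q, loc = 10, scale = 2), p)
```

The round trip through the distribution function is a quick way to confirm that a quantile implementation and its parameterization agree.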

Value

A numeric vector of quantiles corresponding to p.

Examples

qgd(0.5, loc = 0, scale = 1)
qgd(c(0.1, 0.5, 0.9), loc = 0, scale = 1)

Quantile Function of the Generalized Gumbel Distribution

Description

Computes the quantiles of the generalized Gumbel distribution with location parameter loc, scale parameter scale, and shape parameter shape.

Usage

qggd(p, loc = 0, scale = 1, shape = 0)

Arguments

p

A numeric vector of probabilities in $(0,1)$.

loc

A numeric value specifying the location parameter.

scale

A positive numeric value specifying the scale parameter.

shape

A numeric value specifying the shape parameter.

Details

The quantile function is computed as

$Q(p) = \mu - \sigma \log \left( \frac{1 - p^h}{h} \right), \quad h \neq 0,$

with the limiting case

$Q(p) = \mu - \sigma \log(-\log p), \quad h = 0,$

where $\mu$ is the location parameter, $\sigma > 0$ is the scale parameter, and $h$ is the shape parameter.
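
The two branches above can be sketched in base R as follows. The helper name ggd_quantile and the tolerance used to switch to the limiting case are assumptions made for illustration; the package's qggd may handle the limit differently.

```r
# Hypothetical helper for the generalized Gumbel quantile function.
ggd_quantile <- function(p, loc = 0, scale = 1, shape = 0, tol = 1e-8) {
  stopifnot(all(p > 0 & p < 1), scale > 0)
  if (abs(shape) < tol) {
    # Limiting case h -> 0: (1 - p^h) / h tends to -log(p).
    inner <- -log(p)
  } else {
    inner <- (1 - p^shape) / shape
  }
  loc - scale * log(inner)
}

p <- c(0.1, 0.5, 0.9)
ggd_quantile(p, loc = 0, scale = 1, shape = 0.1)

# A very small shape value is numerically close to the Gumbel limit.
ggd_quantile(p, shape = 1e-6) - ggd_quantile(p, shape = 0)
```

Note that $(1 - p^h)/h$ is positive for both signs of $h$, so the logarithm is always defined for $p$ in $(0,1)$.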

Value

A numeric vector of quantiles corresponding to p.

References

Jeong, B.-Y., Murshed, M. S., Seo, Y. A., and Park, J.-S. (2014). A three-parameter kappa distribution with hydrologic application: a generalized Gumbel distribution. Stochastic Environmental Research and Risk Assessment, 28(8), 2063–2074.

Examples

qggd(0.5, loc = 0, scale = 1, shape = 0.1)
qggd(c(0.1, 0.5, 0.9), loc = 0, scale = 1, shape = 0.1)

Quantile Function of the Generalized Logistic Distribution

Description

Computes the quantiles of the generalized logistic distribution with location parameter loc, scale parameter scale, and shape parameter shape.

Usage

qglo(p, loc = 0, scale = 1, shape = 0)

Arguments

p

A numeric vector of probabilities in $(0,1)$.

loc

A numeric value specifying the location parameter.

scale

A positive numeric value specifying the scale parameter.

shape

A numeric value specifying the shape parameter.

Details

The quantile function is computed as

$Q(p) = \mu + \frac{\sigma}{\xi}\left[1 - \left(\frac{1-p}{p}\right)^{\xi}\right], \quad \xi \neq 0,$

with the limiting case

$Q(p) = \mu - \sigma \log\left(\frac{1-p}{p}\right), \quad \xi = 0,$

where $\mu$ is the location parameter, $\sigma > 0$ is the scale parameter, and $\xi$ is the shape parameter.
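
A minimal base-R sketch of this quantile function is given below. The helper name glo_quantile and the switching tolerance are hypothetical. The sketch also shows two properties that follow from the formula: the median equals the location parameter for any shape, and the $\xi = 0$ case coincides with the ordinary logistic quantile (stats::qlogis).

```r
# Hypothetical helper for the generalized logistic quantile function.
glo_quantile <- function(p, loc = 0, scale = 1, shape = 0, tol = 1e-8) {
  stopifnot(all(p > 0 & p < 1), scale > 0)
  odds <- (1 - p) / p
  if (abs(shape) < tol) {
    # Limiting case xi -> 0.
    loc - scale * log(odds)
  } else {
    loc + scale / shape * (1 - odds^shape)
  }
}

p <- c(0.1, 0.5, 0.9)
glo_quantile(p, loc = 0, scale = 1, shape = 0.1)

# At p = 0.5 the odds term is 1, so the median equals loc.
glo_quantile(0.5, loc = 3, scale = 2, shape = 0.1)

# With shape = 0 the result matches the logistic quantile function.
all.equal(glo_quantile(p, loc = 1, scale = 2, shape = 0),
          stats::qlogis(p, location = 1, scale = 2))
```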

Value

A numeric vector of quantiles corresponding to p.

References

Ahmad, M. I., Sinclair, C. D., and Werritty, A. (1988). Log-logistic flood frequency analysis. Journal of Hydrology. doi:10.1016/0022-1694(88)90015-7

Examples

qglo(0.5, loc = 0, scale = 1, shape = 0.1)
qglo(c(0.1, 0.5, 0.9), loc = 0, scale = 1, shape = 0.1)

Quantile Function of the Four-Parameter Kappa Distribution

Description

Computes the quantiles of the four-parameter kappa distribution with location parameter loc, scale parameter scale, and shape parameters shape1 and shape2.

Usage

qk4d(p, loc = 0, scale = 1, shape1 = 0.1, shape2 = 0.1)

Arguments

p

A numeric vector of probabilities in $(0,1)$.

loc

A numeric value specifying the location parameter.

scale

A positive numeric value specifying the scale parameter.

shape1

A numeric value specifying the first shape parameter.

shape2

A numeric value specifying the second shape parameter.

Details

The quantile function of the four-parameter kappa distribution is

$Q(p) = \mu + \frac{\sigma}{\xi}\left[1 - \left(\frac{1-p^h}{h}\right)^\xi \right],$

where $\mu$ is the location parameter, $\sigma > 0$ is the scale parameter, and $\xi$ and $h$ are shape parameters.

For numerical stability, the limiting cases $\xi = 0$ and/or $h = 0$ are handled separately.
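
One way to handle the limiting cases is sketched below in base R. The helper name k4d_quantile, the tolerance tol, and the mapping of shape1 to $\xi$ and shape2 to $h$ are assumptions made for illustration; the package's qk4d may differ in detail. As $h \to 0$ the inner term $(1-p^h)/h$ tends to $-\log p$, and as $\xi \to 0$ the outer term $(1 - u^\xi)/\xi$ tends to $-\log u$.

```r
# Hypothetical helper; assumes shape1 = xi and shape2 = h.
k4d_quantile <- function(p, loc = 0, scale = 1, shape1 = 0.1, shape2 = 0.1,
                         tol = 1e-8) {
  stopifnot(all(p > 0 & p < 1), scale > 0)
  # Inner term (1 - p^h) / h, with limit -log(p) as h -> 0.
  inner <- if (abs(shape2) < tol) -log(p) else (1 - p^shape2) / shape2
  # Outer term (1 - inner^xi) / xi, with limit -log(inner) as xi -> 0.
  outer <- if (abs(shape1) < tol) -log(inner) else (1 - inner^shape1) / shape1
  loc + scale * outer
}

p <- c(0.1, 0.5, 0.9)
k4d_quantile(p, shape1 = 0.1, shape2 = 0.1)

# xi = 0 and h = 0 reduce to the Gumbel quantile function.
all.equal(k4d_quantile(p, shape1 = 0, shape2 = 0), -log(-log(p)))

# h = -1 reduces to the generalized logistic quantile function.
all.equal(k4d_quantile(p, shape1 = 0.2, shape2 = -1),
          (1 - ((1 - p) / p)^0.2) / 0.2)
```

These reductions are consistent with the generalized Gumbel ($\xi = 0$) and generalized logistic ($h = -1$) quantile formulas given elsewhere in this manual.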

Value

A numeric vector of quantiles corresponding to p.

References

Shin, Y., and Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533

Hosking, J. R. M. (1994). The four-parameter Kappa distribution. Cambridge University Press.

Examples

qk4d(0.5, loc = 0, scale = 1, shape1 = 0.1, shape2 = 0.1)
qk4d(c(0.1, 0.5, 0.9), loc = 0, scale = 1, shape1 = 0.1, shape2 = 0.1)

Quantile Function of the Logistic Distribution

Description

Computes the quantiles of the logistic distribution with location parameter loc and scale parameter scale.

Usage

qld(p, loc = 0, scale = 1)

Arguments

p

A numeric vector of probabilities in $(0,1)$.

loc

A numeric value specifying the location parameter.

scale

A positive numeric value specifying the scale parameter.

Details

The quantile function of the logistic distribution is

$Q(p) = \mu + \sigma \log\left(\frac{p}{1-p}\right),$

where $\mu$ is the location parameter and $\sigma > 0$ is the scale parameter.
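
The formula above is the standard logistic quantile function, so a direct base-R sketch can be compared with stats::qlogis. The helper name logistic_quantile is hypothetical and is not the package's qld.

```r
# Hypothetical helper illustrating Q(p) = mu + sigma * log(p / (1 - p)).
logistic_quantile <- function(p, loc = 0, scale = 1) {
  stopifnot(all(p > 0 & p < 1), scale > 0)
  loc + scale * log(p / (1 - p))
}

p <- c(0.1, 0.5, 0.9)
all.equal(logistic_quantile(p, loc = 1, scale = 2),
          stats::qlogis(p, location = 1, scale = 2))
```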

Value

A numeric vector of quantiles corresponding to p.

Examples

qld(0.5, loc = 0, scale = 1)
qld(c(0.1, 0.5, 0.9), loc = 0, scale = 1)

Fit the Gumbel Distribution to r-Largest Order Statistics

Description

Fits the Gumbel distribution to $r$-largest order statistics using maximum likelihood estimation. Stationary and non-stationary models are supported through generalized linear modelling of the location and scale parameters.

Usage

rgd.fit(
  xdat,
  r = NULL,
  ydat = NULL,
  mul = NULL,
  sigl = NULL,
  mulink = identity,
  siglink = identity,
  num_inits = 100,
  muinit = NULL,
  siginit = NULL,
  show = TRUE,
  method = "Nelder-Mead",
  maxit = 10000,
  ...
)

Arguments

xdat

A numeric vector, matrix, or data frame of observations. Each row should contain decreasing order statistics for a given year or block. The first column therefore contains block maxima. Only the first r columns are used in the fitted model. If r is NULL, all available columns are used. If some rows contain fewer order statistics than others, missing values should be appended at the end of the corresponding rows.

r

The number of largest order statistics to use in the fitted model. If NULL, all columns of xdat are used.

ydat

A matrix or data frame of covariates for non-stationary modelling of the parameters, or NULL for a stationary model. The number of rows must match the number of rows of xdat.

mul, sigl

Integer vectors indicating which columns of ydat are used as covariates for the location and scale parameters, respectively. Use NULL for stationary parameters.

mulink, siglink

Inverse link functions for the location and scale parameters, respectively.

num_inits

The number of initial parameter sets used in the optimization.

muinit, siginit

Numeric vectors giving initial values for the location and scale parameters. If NULL, default initial values based on L-moments are used.

show

Logical. If TRUE, details of the fitted model are printed.

method

Optimization method passed to optim for stationary fits.

maxit

Maximum number of iterations for optim.

...

Additional control arguments passed to the optimizer.
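
The layout expected for xdat (rows in decreasing order, missing values appended at the end of shorter rows) can be sketched in base R as follows. The helper r_largest_matrix and the simulated daily series are hypothetical and serve only to illustrate the data layout; they are not part of the package.

```r
# Hypothetical helper: build a matrix holding the r largest values per block.
r_largest_matrix <- function(values, block, r) {
  groups <- split(values, block)
  rows <- lapply(groups, function(v) {
    top <- sort(v[!is.na(v)], decreasing = TRUE)
    # Indexing past the end yields NA, so short blocks are padded at the end.
    top[seq_len(r)]
  })
  out <- do.call(rbind, rows)
  colnames(out) <- paste0("r", seq_len(r))
  out
}

# Simulated illustration data: the last year has only two observations.
set.seed(1)
daily <- data.frame(
  year  = rep(2001:2003, times = c(5, 5, 2)),
  value = round(rexp(12, rate = 0.1), 1)
)
xmat <- r_largest_matrix(daily$value, daily$year, r = 3)
xmat
```

The first column of the result holds the block maxima, and the row for the short year ends in NA, matching the description of xdat above.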

Value

A list with components including:

trans

Logical; TRUE if a non-stationary model is fitted.

model

A list containing mul and sigl.

link

A character string describing the inverse link functions.

conv

The convergence code returned by the optimizer. A value of 0 indicates successful convergence for optim.

nllh

The negative log-likelihood evaluated at the fitted parameters.

data

The data used in the fit.

mle

The maximum likelihood estimates.

cov

The estimated covariance matrix.

se

The estimated standard errors.

vals

A matrix containing fitted values of the location and scale parameters at each observation.

r

The number of order statistics used in the fitted model.

References

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.

Shin, Y., & Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y

Examples

x <- rgdr(n = 50, r = 2, loc = 10, scale = 2)
fit <- rgd.fit(x$rmat)
fit$r
fit$mle

Profile Likelihood for Return Levels under the rGD Model

Description

Computes and plots the profile log-likelihood for a return level under a stationary r-largest Gumbel distribution model fitted by rgd.fit().

Usage

rgd.prof(z, m, xlow, xup, conf = 0.95, nint = 100)

Arguments

z

An object returned by rgd.fit. The fitted model must be stationary.

m

A return period greater than 1. The profile likelihood is computed for the corresponding return level exceeded with probability $1/m$.

xlow, xup

The lower and upper bounds of the return level grid over which the profile likelihood is evaluated.

conf

A numeric vector of confidence levels for profile likelihood confidence intervals.

nint

The number of grid points used to evaluate the profile likelihood.

Details

The function evaluates the profile log-likelihood over a grid of return level values and plots the resulting curve. Horizontal and vertical lines are added to indicate profile likelihood confidence intervals for the confidence levels specified in conf.
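
The grid-based idea can be sketched in base R for the simplest case of Gumbel block maxima ($r = 1$). This is an illustration under stated assumptions, not the code used by rgd.prof: the data are simulated, the model is reparameterized as $\mu = z_m - \sigma c_m$ with $c_m = -\log\{-\log(1 - 1/m)\}$, and the interval is read off the grid using the usual chi-squared cutoff with one degree of freedom.

```r
set.seed(123)
n <- 50
# Simulated Gumbel block maxima (loc = 10, scale = 2) via inverse transform.
x <- 10 - 2 * log(-log(runif(n)))

# Gumbel negative log-likelihood for block maxima.
nll <- function(mu, sigma) {
  if (sigma <= 0) return(Inf)
  y <- (x - mu) / sigma
  sum(log(sigma) + y + exp(-y))
}

m <- 100                          # return period
c_m <- -log(-log(1 - 1 / m))      # return level: z_m = mu + sigma * c_m

# Profile log-likelihood: fix z_m and maximize over sigma.
prof_ll <- function(z) {
  opt <- optimize(function(s) nll(z - s * c_m, s), interval = c(0.2, 20))
  -opt$objective
}

grid <- seq(12, 35, length.out = 200)
pl <- vapply(grid, prof_ll, numeric(1))

# Grid points whose profile log-likelihood lies above the 95% cutoff.
cutoff <- max(pl) - 0.5 * qchisq(0.95, df = 1)
ci <- range(grid[pl >= cutoff])
z_hat <- grid[which.max(pl)]
c(estimate = z_hat, lower = ci[1], upper = ci[2])
```

If either limit coincides with an end of the grid, the bounds (xlow and xup in rgd.prof) are too narrow and should be widened.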

Value

A data frame containing the return period, estimated return level, confidence level, lower confidence limit, upper confidence limit, and interval width. A profile likelihood plot is also produced.

Examples

## Not run: 
x <- rgdr(n = 50, r = 2, loc = 10, scale = 2)
fit <- rgd.fit(x$rmat)
rgd.prof(fit, m = 100, xlow = 12, xup = 25)

## End(Not run)

Return Levels for the Gumbel Distribution

Description

Computes return levels and their standard errors for a stationary Gumbel model fitted by rgd.fit.

Usage

rgd.rl(z, year = c(20, 50, 100, 200), show = FALSE)

Arguments

z

An object returned by rgd.fit. The fitted model should represent a stationary model.

year

A numeric vector of return periods for which return levels are to be computed.

show

Logical. If TRUE, the estimated return levels and their standard errors are printed.

Details

For a return period $T$, the return level is defined as the quantile exceeded with probability $1/T$. Under the Gumbel distribution, the return level is

$x_T = \mu - \sigma \log\{-\log(1 - 1/T)\}.$

Standard errors are obtained using the delta method.
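
The delta-method calculation can be sketched in base R for Gumbel block maxima. The example is an assumption-laden illustration rather than the code of rgd.rl: it simulates data, fits the model with optim, and approximates the covariance matrix by the inverse of the numerically computed Hessian. Because $x_T = \mu + \sigma c_T$ with $c_T = -\log\{-\log(1 - 1/T)\}$, the gradient with respect to $(\mu, \sigma)$ is $(1, c_T)$.

```r
set.seed(42)
x <- 10 - 2 * log(-log(runif(60)))      # simulated Gumbel block maxima

# Gumbel negative log-likelihood; invalid scale values return Inf.
nll <- function(par) {
  mu <- par[1]; sigma <- par[2]
  if (sigma <= 0) return(Inf)
  y <- (x - mu) / sigma
  sum(log(sigma) + y + exp(-y))
}

fit <- optim(c(mean(x), sd(x)), nll, hessian = TRUE)
V <- solve(fit$hessian)                 # approximate covariance of (mu, sigma)

year <- c(20, 50, 100)
c_T <- -log(-log(1 - 1 / year))
rl <- fit$par[1] + fit$par[2] * c_T     # x_T = mu + sigma * c_T
grad <- cbind(1, c_T)                   # gradient of x_T, one row per T
rlse <- sqrt(rowSums((grad %*% V) * grad))
data.frame(year, rl, rlse)
```

The expression rowSums((grad %*% V) * grad) evaluates the quadratic form $g^\top V g$ for each return period without an explicit loop.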

Value

The input object z with two additional components:

rl

A numeric vector of estimated return levels.

rlse

A numeric vector of standard errors of the estimated return levels.

Examples

x <- rgdr(n = 50, r = 2, loc = 10, scale = 2)
fit <- rgd.fit(x$rmat)
out <- rgd.rl(fit, year = c(20, 50, 100))
out$rl
out$rlse

Summary of Fitted rGD Models over Different Values of r

Description

Summarizes fitted Gumbel distribution models for r-largest order statistics over $r = 1, \dots, R$. For each value of r, the function fits the model using rgd.fit and computes return levels using rgd.rl.

Usage

rgd.summary(
  data,
  r = NULL,
  ydat = NULL,
  mul = NULL,
  sigl = NULL,
  mulink = identity,
  siglink = identity,
  num_inits = 100,
  muinit = NULL,
  siginit = NULL,
  show = FALSE,
  method = "Nelder-Mead",
  maxit = 10000,
  ...
)

Arguments

data

A numeric vector, matrix, or data frame containing the r-largest order statistics. Each row should contain decreasing order statistics for one block or time period.

r

Optional integer giving the maximum number of order statistics to summarize. If NULL, all available columns are used.

ydat

A matrix or data frame of covariates for generalized linear modelling of the parameters, or NULL for stationary fitting.

mul, sigl

Integer vectors indicating which columns of ydat are used as covariates for the location and scale parameters, respectively.

mulink, siglink

Inverse link functions for the location and scale parameters, respectively.

num_inits

Number of initial parameter sets used in optimization.

muinit, siginit

Optional initial values for the location and scale parameters.

show

Logical. If TRUE, print details from model fitting.

method

Optimization method passed to optim.

maxit

Maximum number of iterations for optimization.

...

Additional arguments passed to rgd.fit.

Value

A data frame containing:

r: number of order statistics used
nllh: negative log-likelihood
mu, sigma: parameter estimates
mu.se, sigma.se: standard errors
rl20, rl50, rl100, rl200: return levels
rl20.se, rl50.se, rl100.se, rl200.se: standard errors of return levels

Examples

x <- rgdr(n = 50, r = 3, loc = 10, scale = 2)
rgd.summary(x$rmat)

Random Generation from the Gumbel Distribution for r-Largest Order Statistics

Description

Generates random samples from the Gumbel distribution for $r$-largest order statistics.

Usage

rgdr(n, r, loc = 0, scale = 1)

Arguments

n

A positive integer specifying the number of observations.

r

A positive integer specifying the number of order statistics for each observation.

loc

A numeric value specifying the location parameter.

scale

A positive numeric value specifying the scale parameter.

Details

The function first generates independent uniform random variables and then constructs decreasing variables through cumulative products. These are transformed using the Gumbel quantile function qgd.
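
The construction described above can be sketched in base R. The function name simulate_r_largest_gumbel is hypothetical, and the code is written from the description rather than taken from the package source. Row-wise cumulative products of uniforms are decreasing in $(0,1)$, so applying the increasing Gumbel quantile function yields decreasing order statistics within each row.

```r
# Hypothetical re-implementation of the described steps.
simulate_r_largest_gumbel <- function(n, r, loc = 0, scale = 1) {
  stopifnot(n >= 1, r >= 1, scale > 0)
  umat <- matrix(runif(n * r), nrow = n, ncol = r)
  # Row-wise cumulative products; apply() drops dimensions when r = 1.
  wmat <- if (r > 1) t(apply(umat, 1, cumprod)) else umat
  # Gumbel quantile transform Q(w) = loc - scale * log(-log(w)).
  rmat <- loc - scale * log(-log(wmat))
  list(umat = umat, wmat = wmat, rmat = rmat)
}

set.seed(1)
sim <- simulate_r_largest_gumbel(10, 3, loc = 0, scale = 1)
sim$rmat

# Each row is in decreasing order.
all(sim$rmat[, 1] >= sim$rmat[, 2] & sim$rmat[, 2] >= sim$rmat[, 3])
```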

Value

A list with components:

umat

An n x r matrix of independent uniform random numbers.

wmat

An n x r matrix of transformed uniform variables used to construct decreasing order statistics.

rmat

An n x r matrix of simulated $r$-largest order statistics from the Gumbel distribution.

Examples

x <- rgdr(10, 3, loc = 0, scale = 1)
x$rmat

Fit the Generalized Gumbel Distribution to r-Largest Order Statistics

Description

Fits the generalized Gumbel distribution to $r$-largest order statistics using maximum likelihood estimation. Stationary and non-stationary models are supported through generalized linear modelling of the location, scale, and shape parameters.

Usage

rggd.fit(
  xdat,
  r = NULL,
  ydat = NULL,
  mul = NULL,
  sigl = NULL,
  hl = NULL,
  mulink = identity,
  siglink = identity,
  hlink = identity,
  num_inits = 100,
  muinit = NULL,
  siginit = NULL,
  hinit = NULL,
  show = TRUE,
  method = "Nelder-Mead",
  maxit = 10000,
  ...
)

Arguments

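xdat

A numeric vector, matrix, or data frame of observations. Each row should contain decreasing order statistics for a given year or block. The first column therefore contains block maxima. Only the first r columns are used in the fitted model. If r is NULL, all available columns are used. If some rows contain fewer order statistics than others, missing values should be appended at the end of the corresponding rows.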

r

The number of largest order statistics to use in the fitted model. If NULL, all columns of xdat are used.

ydat

A matrix or data frame of covariates for non-stationary modelling of the parameters, or NULL for a stationary model. The number of rows must match the number of rows of xdat.

mul, sigl, hl

Integer vectors indicating which columns of ydat are used as covariates for the location, scale, and shape parameters, respectively. Use NULL for stationary parameters.

mulink, siglink, hlink

Inverse link functions for the location, scale, and shape parameters, respectively.

num_inits

The number of initial parameter sets used in the optimization.

muinit, siginit, hinit

Numeric vectors giving initial values for the location, scale, and shape parameters. If NULL, default initial values based on L-moments are used.

show

Logical. If TRUE, details of the fitted model are printed.

method

Optimization method passed to optim for stationary fits.

maxit

Maximum number of iterations for optim.

...

Additional control arguments passed to the optimizer.

Value

A list with components including:

trans

Logical; TRUE if a non-stationary model is fitted.

model

A list containing mul, sigl, and hl.

link

A character vector describing the inverse link functions.

conv

The convergence code returned by the optimizer.

nllh

The negative log-likelihood evaluated at the fitted parameters.

data

The data used in the fit.

mle

The maximum likelihood estimates.

cov

The estimated covariance matrix when available.

se

The estimated standard errors when available.

vals

A matrix containing fitted values of the location, scale, and shape parameters at each observation.

r

The number of order statistics used in the fitted model.

References

Shin, Y., & Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y

Examples

x <- rggdr(n = 50, r = 2, loc = 10, scale = 2, shape = 0.1)
fit <- rggd.fit(x$rmat)
fit$r
fit$mle

Profile Likelihood for Return Levels under the rGGD Model

Description

Computes and plots the profile log-likelihood for a return level under a stationary r-largest generalized Gumbel distribution (rGGD) model fitted by rggd.fit.

Usage

rggd.prof(z, m, xlow, xup, conf = 0.95, nint = 100)

Arguments

z

An object returned by rggd.fit. The fitted model must represent a stationary model.

m

A return period greater than 1. The profile likelihood is computed for the corresponding return level exceeded with probability $1/m$.

xlow, xup

Lower and upper bounds of the return level grid over which the profile likelihood is evaluated.

conf

A numeric vector of confidence levels for profile likelihood confidence intervals.

nint

The number of grid points used to evaluate the profile likelihood.

Value

A data frame containing the return period, estimated return level, confidence level, lower confidence limit, upper confidence limit, and interval width. A profile likelihood plot is also produced.

References

Shin, Y., & Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y

Examples

## Not run: 
x <- rggdr(n = 50, r = 2, loc = 10, scale = 2, shape = 0.1)
fit <- rggd.fit(x$rmat)
rggd.prof(fit, m = 100, xlow = 12, xup = 25)

## End(Not run)

Return Levels for the Generalized Gumbel Distribution

Description

Computes return levels and their standard errors for a stationary generalized Gumbel model fitted by rggd.fit.

Usage

rggd.rl(z, year = c(20, 50, 100, 200), show = FALSE)

Arguments

z

An object returned by rggd.fit. The fitted model should represent a stationary model.

year

A numeric vector of return periods for which return levels are to be computed.

show

Logical. If TRUE, the estimated return levels and their standard errors are printed.

Details

For a return period $T$, the return level is defined as the quantile exceeded with probability $1/T$. Under the generalized Gumbel distribution, the return level is

$x_T = \mu - \sigma \log\left(\frac{1-(1-1/T)^h}{h}\right), \quad h \neq 0.$

Standard errors are obtained using the delta method.
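
For the return level formula above, the delta method needs the gradient of $x_T$ with respect to $(\mu, \sigma, h)$. Writing $p = 1 - 1/T$, the partial derivatives are $1$, $-\log\{(1-p^h)/h\}$, and $\sigma\{p^h \log p / (1 - p^h) + 1/h\}$. The base-R sketch below applies them; the function name ggd_return_level is hypothetical, and the estimates and covariance matrix are illustrative inputs rather than fitted results.

```r
# Hypothetical helper: return levels and delta-method standard errors (h != 0).
ggd_return_level <- function(mle, cov, year = c(20, 50, 100, 200)) {
  mu <- mle[1]; sigma <- mle[2]; h <- mle[3]
  stopifnot(sigma > 0, h != 0, all(year > 1))
  p <- 1 - 1 / year
  inner <- (1 - p^h) / h
  rl <- mu - sigma * log(inner)
  # Gradient of x_T with respect to (mu, sigma, h), one row per return period.
  grad <- cbind(
    1,
    -log(inner),
    sigma * (p^h * log(p) / (1 - p^h) + 1 / h)
  )
  rlse <- sqrt(rowSums((grad %*% cov) * grad))
  list(rl = rl, rlse = rlse, grad = grad)
}

# Illustrative inputs only (not fitted results).
mle <- c(10, 2, 0.1)
V <- diag(c(0.10, 0.05, 0.02))
out <- ggd_return_level(mle, V, year = c(20, 50, 100))
out$rl
out$rlse
```

In practice the covariance matrix would come from the fitted model (the cov component returned by rggd.fit) instead of the placeholder used here.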

Value

The input object z with two additional components:

rl

A numeric vector of estimated return levels.

rlse

A numeric vector of standard errors of the estimated return levels.

Examples

x <- rggdr(n = 50, r = 2, loc = 10, scale = 2, shape = 0.1)
fit <- rggd.fit(x$rmat)
out <- rggd.rl(fit, year = c(20, 50, 100))
out$rl
out$rlse

Summary of Fitted rGGD Models over Different Values of r

Description

Summarizes fitted generalized Gumbel distribution models for r-largest order statistics over $r = 1, \dots, R$. For each value of r, the function fits the model using rggd.fit and computes return levels using rggd.rl.

Usage

rggd.summary(
  data,
  r = NULL,
  ydat = NULL,
  mul = NULL,
  sigl = NULL,
  hl = NULL,
  mulink = identity,
  siglink = identity,
  hlink = identity,
  num_inits = 100,
  muinit = NULL,
  siginit = NULL,
  hinit = NULL,
  show = FALSE,
  method = "Nelder-Mead",
  maxit = 10000,
  ...
)

Arguments

data

A numeric vector, matrix, or data frame containing the r-largest order statistics. Each row should contain decreasing order statistics for one block or time period.

r

Optional integer giving the maximum number of order statistics to summarize. If NULL, all available columns are used.

ydat

A matrix or data frame of covariates for generalized linear modelling of the parameters, or NULL for stationary fitting.

mul, sigl, hl

Integer vectors indicating which columns of ydat are used for the location, scale, and shape parameters, respectively.

mulink, siglink, hlink

Inverse link functions for the location, scale, and shape parameters, respectively.

num_inits

Number of initial parameter sets used in optimization.

muinit, siginit, hinit

Optional initial values for the location, scale, and shape parameters.

show

Logical. If TRUE, print details from model fitting.

method

Optimization method passed to optim.

maxit

Maximum number of iterations for optimization.

...

Additional arguments passed to rggd.fit.

Value

A data frame containing:

r: number of order statistics used
nllh: negative log-likelihood
mu, sigma, h: parameter estimates
mu.se, sigma.se, h.se: standard errors
rl20, rl50, rl100, rl200: return levels
rl20.se, rl50.se, rl100.se, rl200.se: standard errors of return levels

Examples

x <- rggdr(n = 50, r = 3, loc = 10, scale = 2, shape = 0.1)
rggd.summary(x$rmat)

Entropy Difference Test for rGGD Models

Description

Performs the entropy difference (ED) test for selecting the number of order statistics in the r-largest generalized Gumbel distribution (rGGD) model.

Usage

rggdEd(data)

Arguments

data

A numeric matrix or data frame containing the r-largest order statistics. Each row represents one block or observation, and columns must be ordered from largest to smallest.

Details

The test compares the entropy of models fitted with $r$ and $r-1$ order statistics and evaluates whether the additional order statistic provides significant information.

This function fits the rGGD model using rggd.fit and then computes the entropy difference test statistic by comparing the fitted likelihood contributions from models with $r$ and $r-1$ order statistics.

Value

A list containing:

statistics: the entropy difference test statistic
p.value: the two-sided p-value
theta: the estimated parameter vector of the rGGD model
ybar: the sample mean entropy difference

References

Shin, Y., & Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y

Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of $r$ for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3

Examples

## Not run: 
data(bangkok)
rggdEd(bangkok)

## End(Not run)

Sequential Entropy Difference Test for rGGD Models

Description

Performs the sequential entropy difference (ED) test for selecting the number of order statistics in the r-largest generalized Gumbel distribution (rGGD) model.

Usage

rggdEdtest(data)

Arguments

data

A numeric matrix or data frame containing the r-largest order statistics. Each row represents one observation (or block), and columns must be ordered from largest to smallest.

Details

The procedure computes ED tests sequentially for $r = 2, \dots, R$ and applies the ForwardStop and StrongStop stopping rules to control the false discovery rate.

The function sequentially applies the entropy difference test (rggdEd) for increasing values of $r$. The columns of data must represent decreasing order statistics within each row, with the first column containing the block maximum. The resulting p-values are adjusted using the ForwardStop and StrongStop procedures to help determine an appropriate value of $r$.
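
The two stopping rules can be sketched in base R from their usual definitions for an ordered sequence of p-values $p_1, \dots, p_m$: the ForwardStop value at step $k$ is $-\frac{1}{k}\sum_{i \le k} \log(1 - p_i)$ and the StrongStop value is $\frac{m}{k}\exp\left(\sum_{j \ge k} \log(p_j)/j\right)$. The helper name seq_stop and the p-values below are hypothetical, and the sketch does not claim to reproduce how rggdEdtest maps the adjusted values to a choice of $r$.

```r
# Hypothetical helper computing ForwardStop and StrongStop adjusted values.
seq_stop <- function(p) {
  stopifnot(all(p >= 0 & p <= 1))
  m <- length(p)
  k <- seq_len(m)
  # ForwardStop: running mean of -log(1 - p_i) over the first k tests.
  forward <- cumsum(-log(1 - p)) / k
  # StrongStop: tail sums of log(p_j) / j from j = k to m, scaled by m / k.
  strong <- rev(exp(cumsum(rev(log(p) / k)))) * m / k
  data.frame(step = k, p.value = p, ForwardStop = forward, StrongStop = strong)
}

# Illustrative p-values only (not results from any dataset).
p <- c(0.62, 0.41, 0.03, 0.20)
res <- seq_stop(p)
res

# Largest step whose adjusted value does not exceed alpha (0 if none).
alpha <- 0.05
max(c(0, which(res$ForwardStop <= alpha)))
max(c(0, which(res$StrongStop <= alpha)))
```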

Value

A data frame containing:

r: value of $r$ tested
p.values: raw p-values from the entropy difference tests
statistic: test statistics for each value of $r$
est.loc: estimated location parameter
est.scale: estimated scale parameter
est.shape: estimated shape parameter
ybar: mean entropy difference
ForwardStop: adjusted values from the ForwardStop rule
StrongStop: adjusted values from the StrongStop rule

References

Shin, Y., & Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y

Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of $r$ for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3

Examples

## Not run: 
data(bangkok)
rggdEdtest(bangkok)

## End(Not run)

Negative Log-Likelihood for the rGGD Model

Description

Computes the negative log-likelihood for the r-largest generalized Gumbel distribution (rGGD) model.

Usage

rggdLh(data, par)

Arguments

data

A numeric vector, matrix, or data frame of observations. If a vector is supplied, it is treated as a one-column matrix. If a matrix or data frame is supplied, each row is treated as one observation and columns represent decreasing order statistics.

par

A numeric vector of length 3 giving the location, scale, and shape parameters, respectively.

Details

This function is intended for internal likelihood evaluation in optimization. Invalid parameter combinations return Inf rather than stopping with an error, which makes the function more robust when used inside optimizers such as optim.
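
The benefit of returning Inf can be seen with any likelihood that has a restricted parameter space. The sketch below uses an ordinary Gumbel negative log-likelihood as a stand-in (it is not rggdLh) and passes it to optim with the Nelder-Mead method, which tolerates non-finite objective values away from the starting point.

```r
set.seed(123)
# Simulate Gumbel data by inverse transform: x = loc - scale * log(-log(U))
x <- 10 - 2 * log(-log(runif(200)))

# Stand-in objective (not rggdLh): Gumbel negative log-likelihood.
gumbel_nll <- function(par, x) {
  loc <- par[1]
  scale <- par[2]
  # Invalid scale: return Inf instead of stopping with an error,
  # so the optimizer simply moves away from this region.
  if (!is.finite(scale) || scale <= 0) return(Inf)
  z <- (x - loc) / scale
  sum(log(scale) + z + exp(-z))
}

# The starting value must give a finite objective; later steps may hit Inf.
fit <- optim(c(mean(x), sd(x)), gumbel_nll, x = x, method = "Nelder-Mead")
fit$par
```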

Value

A single numeric value giving the negative log-likelihood. If the parameter combination is invalid, the function returns Inf.

References

Shin, Y., & Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y

Random Generation from the Generalized Gumbel Distribution for r-Largest Order Statistics

Description

Generates random samples from the generalized Gumbel distribution for $r$-largest order statistics.

Usage

rggdr(n, r, loc = 0, scale = 1, shape = 0.1)

Arguments

n

A positive integer specifying the number of observations.

r

A positive integer specifying the number of order statistics for each observation.

loc

A numeric value specifying the location parameter.

scale

A positive numeric value specifying the scale parameter.

shape

A numeric value specifying the shape parameter.

Details

The function first generates independent uniform random variables and then constructs decreasing variables through recursive transformations depending on the shape parameter. These are transformed using the generalized Gumbel quantile function qggd.

For valid generation, the shape parameter $h$ must satisfy $1 - (j-1)h > 0$ for $j = 2, \dots, r$, which implies $h < 1/(r-1)$ when $r > 1$.
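
The constraint can be checked directly before simulating. The following base R sketch evaluates the condition for each $j$; `shape_ok` is a hypothetical helper name, not a function exported by evmr.

```r
# Hypothetical helper (not an evmr function):
# TRUE if 1 - (j - 1) * h > 0 for every j = 2, ..., r.
shape_ok <- function(h, r) {
  if (r <= 1) return(TRUE)             # no constraint for a single order statistic
  j <- 2:r
  all(1 - (j - 1) * h > 0)
}

shape_ok(0.1, 3)    # 0.1 is below 1 / (3 - 1)
shape_ok(0.5, 3)    # 0.5 equals 1 / (3 - 1), so the strict inequality fails
shape_ok(-0.2, 10)  # negative shapes satisfy the condition for any r
```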

Value

A list with components:

umat

An n x r matrix of independent uniform random numbers.

wmat

An n x r matrix of transformed uniform variables used to construct decreasing order statistics.

rmat

An n x r matrix of simulated $r$-largest order statistics from the generalized Gumbel distribution.

Examples

x <- rggdr(10, 3, loc = 0, scale = 1, shape = 0.1)
x$rmat

Fit the Generalized Logistic Distribution to r-Largest Order Statistics

Description

Fits the generalized logistic distribution to $r$-largest order statistics using maximum likelihood estimation. Stationary and non-stationary models are supported through generalized linear modelling of the location, scale, and shape parameters.

Usage

rglo.fit(
  xdat,
  r = NULL,
  ydat = NULL,
  mul = NULL,
  sigl = NULL,
  shl = NULL,
  mulink = identity,
  siglink = identity,
  shlink = identity,
  num_inits = 100,
  muinit = NULL,
  siginit = NULL,
  shinit = NULL,
  show = TRUE,
  method = "Nelder-Mead",
  maxit = 10000,
  ...
)

Arguments

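xdat

A numeric vector, matrix, or data frame containing the r-largest order statistics. Each row should contain decreasing order statistics for one block or time period.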

r

The number of largest order statistics to use in the fitted model. If NULL, all columns of xdat are used.

ydat

A matrix or data frame of covariates for non-stationary modelling of the parameters, or NULL for a stationary model. The number of rows must match the number of rows of xdat.

mul, sigl, shl

Integer vectors indicating which columns of ydat are used as covariates for the location, scale, and shape parameters, respectively. Use NULL for stationary parameters.

mulink, siglink, shlink

Inverse link functions for the location, scale, and shape parameters, respectively.

num_inits

The number of initial parameter sets used in the optimization.

muinit, siginit, shinit

Numeric vectors giving initial values for the location, scale, and shape parameters. If NULL, default initial values based on L-moments are used.

show

Logical. If TRUE, details of the fitted model are printed.

method

Optimization method passed to optim for stationary fits.

maxit

Maximum number of iterations for optim.

...

Additional control arguments passed to the optimizer.
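
To illustrate what an inverse link does for a non-stationary parameter, the sketch below maps a linear predictor to the scale parameter with either identity or exp. The design-matrix layout (an intercept column followed by the selected columns of ydat) and the coefficient values are assumptions made for illustration; this manual does not state how rglo.fit builds its linear predictors.

```r
# Hypothetical covariate matrix with one column (a linear trend)
ydat <- cbind(trend = seq(0, 1, length.out = 5))
sigl <- 1                    # column of ydat used for the scale parameter
beta_sig <- c(0.5, 0.3)      # illustrative intercept and slope, not fitted values

# Assumed design matrix: intercept plus the selected covariate columns
X <- cbind(1, ydat[, sigl, drop = FALSE])

# Identity inverse link: the scale is the linear predictor itself
sigma_identity <- identity(X %*% beta_sig)

# exp inverse link: the scale stays positive for any coefficient values
sigma_exp <- exp(X %*% beta_sig)

cbind(sigma_identity, sigma_exp)
```

An exp inverse link for the scale (siglink = exp) is a common choice because it rules out negative fitted scale values during optimization.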

Value

A list with components including:

trans

Logical; TRUE if a non-stationary model is fitted.

model

A list containing mul, sigl, and shl.

link

A character vector describing the inverse link functions.

conv

The convergence code returned by the optimizer.

nllh

The negative log-likelihood evaluated at the fitted parameters.

data

The data used in the fit.

mle

The maximum likelihood estimates.

cov

The estimated covariance matrix when available.

se

The estimated standard errors when available.

vals

A matrix containing fitted values of the location, scale, and shape parameters at each observation.

r

The number of order statistics used in the fitted model.

References

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.

Shin, Y., & Park, J.-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7

Examples

x <- rglor(n = 50, r = 2, loc = 10, scale = 2, shape = 0.1)
fit <- rglo.fit(x$rmat, num_inits = 5)
fit$r
fit$mle

Profile Likelihood for Return Levels under the rGLO Model

Description

Computes and plots the profile log-likelihood for a return level under a stationary r-largest generalized logistic distribution (rGLO) model fitted by rglo.fit.

Usage

rglo.prof(z, m, xlow, xup, conf = 0.95, nint = 100)

Arguments

z

An object returned by rglo.fit. The fitted model must represent a stationary model.

m

A return period greater than 1. The profile likelihood is computed for the corresponding return level exceeded with probability $1/m$.

xlow, xup

Lower and upper bounds of the return level grid over which the profile likelihood is evaluated.

conf

A numeric vector of confidence levels for profile likelihood confidence intervals.

nint

The number of grid points used to evaluate the profile likelihood.

Value

A data frame containing the return period, estimated return level, confidence level, lower confidence limit, upper confidence limit, and interval width. A profile likelihood plot is also produced.
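
Profile likelihood confidence limits are conventionally read off a grid as the range of return levels whose profile log-likelihood lies within half a chi-squared quantile (one degree of freedom) of the maximum, as described in Coles (2001). The base R sketch below applies that rule to a toy quadratic profile; it is an assumption that rglo.prof uses exactly this cut-off, and `profile_ci` is a hypothetical helper name.

```r
# Hypothetical helper (not an evmr function): confidence limits from a
# profile log-likelihood evaluated on a grid of return levels.
profile_ci <- function(grid, proflik, conf = 0.95) {
  cutoff <- max(proflik) - 0.5 * qchisq(conf, df = 1)
  inside <- grid[proflik >= cutoff]    # grid points inside the interval
  c(lower = min(inside), upper = max(inside))
}

# Toy profile: quadratic log-likelihood centred at 5 with unit curvature
grid <- seq(0, 10, by = 0.01)
proflik <- -0.5 * (grid - 5)^2

ci <- profile_ci(grid, proflik, conf = 0.95)
ci
```

The accuracy of the limits is bounded by the grid spacing, which is why a larger nint or a narrower range between xlow and xup gives sharper intervals.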

Examples

## Not run: 
x <- rglor(n = 50, r = 2, loc = 10, scale = 2, shape = 0.1)
fit <- rglo.fit(x$rmat)
rglo.prof(fit, m = 100, xlow = 12, xup = 25)

## End(Not run)

Return Levels for the Generalized Logistic Distribution

Description

Computes return levels and their standard errors for a stationary generalized logistic model fitted by rglo.fit.

Usage

rglo.rl(z, year = c(20, 50, 100, 200), show = FALSE)

Arguments

z

An object returned by rglo.fit. The fitted model should represent a stationary model.

year

A numeric vector of return periods for which return levels are to be computed.

show

Logical. If TRUE, the estimated return levels and their standard errors are printed.

Details

For a return period $T$, the return level is defined as the quantile exceeded with probability $1/T$. Under the generalized logistic distribution, the return level is

$x_T = \mu + \frac{\sigma}{\xi} \left[1 - \left(\frac{1 - 1/T}{1/T}\right)^{-\xi}\right],$

which is equivalently written in the implementation as

$x_T = \mu + \frac{\sigma}{\xi} - \frac{\sigma}{\xi} \left(\frac{1/T}{1 - 1/T}\right)^{\xi}.$

Standard errors are obtained using the delta method.
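
The return level formula and a delta-method standard error can be sketched in base R as follows. The function names are hypothetical, the gradient is derived by hand from the second form of the formula, and the covariance matrix is made up for illustration; in practice it would come from the cov component of a fitted object.

```r
# Return level under the generalized logistic model (xi must be non-zero)
glo_rl <- function(mu, sigma, xi, period) {
  y <- (1 / period) / (1 - 1 / period)
  mu + sigma / xi - (sigma / xi) * y^xi
}

# Gradient of the return level with respect to (mu, sigma, xi)
glo_rl_grad <- function(mu, sigma, xi, period) {
  y <- (1 / period) / (1 - 1 / period)
  c(mu = 1,
    sigma = (1 - y^xi) / xi,
    xi = -sigma * (1 - y^xi) / xi^2 - (sigma / xi) * y^xi * log(y))
}

# Delta method: se = sqrt(g' V g), with V the parameter covariance matrix
glo_rl_se <- function(mu, sigma, xi, period, V) {
  g <- glo_rl_grad(mu, sigma, xi, period)
  sqrt(drop(t(g) %*% V %*% g))
}

V <- diag(c(0.09, 0.04, 0.01))   # made-up covariance matrix for illustration
rl <- glo_rl(10, 2, 0.1, period = c(20, 50, 100))
se <- sapply(c(20, 50, 100), function(p) glo_rl_se(10, 2, 0.1, p, V))
cbind(rl, se)
```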

Value

The input object z with two additional components:

rl

A numeric vector of estimated return levels.

rlse

A numeric vector of standard errors of the estimated return levels.

Examples

x <- rglor(n = 50, r = 2, loc = 10, scale = 2, shape = 0.1)
fit <- rglo.fit(x$rmat)
out <- rglo.rl(fit, year = c(20, 50, 100))
out$rl
out$rlse

Summary of Fitted rGLO Models over Different Values of r

Description

Summarizes fitted generalized logistic distribution models for r-largest order statistics over $r = 1, \dots, R$. For each value of r, the function fits the model using rglo.fit and computes return levels using rglo.rl.

Usage

rglo.summary(
  data,
  r = NULL,
  ydat = NULL,
  mul = NULL,
  sigl = NULL,
  shl = NULL,
  mulink = identity,
  siglink = identity,
  shlink = identity,
  num_inits = 100,
  muinit = NULL,
  siginit = NULL,
  shinit = NULL,
  show = FALSE,
  method = "Nelder-Mead",
  maxit = 10000,
  ...
)

Arguments

data

A numeric vector, matrix, or data frame containing the r-largest order statistics. Each row should contain decreasing order statistics for one block or time period.

r

Optional integer giving the maximum number of order statistics to summarize. If NULL, all available columns are used.

ydat

A matrix or data frame of covariates for generalized linear modelling of the parameters, or NULL for stationary fitting.

mul, sigl, shl

Integer vectors indicating which columns of ydat are used for the location, scale, and shape parameters, respectively.

mulink, siglink, shlink

Inverse link functions for the location, scale, and shape parameters, respectively.

num_inits

Number of initial parameter sets used in optimization.

muinit, siginit, shinit

Optional initial values for the location, scale, and shape parameters.

show

Logical. If TRUE, print details from model fitting.

method

Optimization method passed to optim.

maxit

Maximum number of iterations for optimization.

...

Additional arguments passed to rglo.fit.

Value

A data frame containing:

r: number of order statistics used
nllh: negative log-likelihood
mu, sigma, xi: parameter estimates
mu.se, sigma.se, xi.se: standard errors
rl20, rl50, rl100, rl200: return levels
rl20.se, rl50.se, rl100.se, rl200.se: standard errors of return levels

Examples

x <- rglor(n = 50, r = 3, loc = 10, scale = 2, shape = 0.1)
rglo.summary(x$rmat, num_inits = 5)

Entropy Difference Test for rGLO Models

Description

Performs the entropy difference (ED) test for selecting the number of order statistics in the r-largest generalized logistic distribution (rGLO) model.

Usage

rgloEd(data, par = NULL)

Arguments

data

A numeric matrix or data frame containing the r-largest order statistics. Each row represents one block or observation, and columns must be ordered from largest to smallest.

par

An optional numeric vector of length 3 giving the location, scale, and shape parameters. If NULL, the parameters are estimated using rglo.fit.

Details

The test compares the entropy of models fitted with $r$ and $r-1$ order statistics and evaluates whether the additional order statistic provides significant information.

This function applies the entropy difference test to the r-largest generalized logistic model. If par is not supplied, the model parameters are first estimated using rglo.fit.
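
The general shape of such a test can be sketched as a standardized mean with a two-sided normal p-value. This is a generic illustration in the spirit of Bader et al. (2017), not the package's code: `ed_stat` is a hypothetical name, `y` stands for the per-block differences in log-likelihood contributions between the models with $r$ and $r-1$ order statistics, and `eta` stands for their expected value under the null hypothesis, which is model-specific and not derived here.

```r
# Hypothetical helper (not an evmr function): standardized mean difference
# with a two-sided p-value from the standard normal distribution.
ed_stat <- function(y, eta) {
  n <- length(y)
  ybar <- mean(y)
  stat <- sqrt(n) * (ybar - eta) / sd(y)
  list(statistic = stat,
       p.value = 2 * pnorm(-abs(stat)),
       ybar = ybar)
}

y <- c(1, 2, 3)                # illustrative differences, not real output
res_null <- ed_stat(y, eta = 2)  # sample mean equals eta
res_alt  <- ed_stat(y, eta = 1)  # sample mean is one unit above eta
res_null$p.value
res_alt$statistic
```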

Value

A list containing:

statistics: the entropy difference test statistic
p.value: the two-sided p-value
theta: the estimated or supplied parameter vector
ybar: the sample mean entropy difference

References

Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of $r$ for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3

Examples

## Not run: 
data(bangkok)
rgloEd(bangkok)

## End(Not run)

Sequential Entropy Difference Test for rGLO Models

Description

Performs the sequential entropy difference (ED) test for selecting the number of order statistics in the r-largest generalized logistic distribution (rGLO) model.

Usage

rgloEdtest(data, par = NULL)

Arguments

data

A numeric matrix or data frame containing the r-largest order statistics. Each row represents one observation (or block), and columns must be ordered from largest to smallest.

par

An optional numeric vector of length 3 giving the location, scale, and shape parameters. If NULL, parameters are estimated separately at each value of $r$ using rgloEd.

Details

The function applies the entropy difference test (rgloEd) sequentially for $r = 2, \dots, R$. The resulting p-values are adjusted using the ForwardStop and StrongStop stopping rules, which control the false discovery rate and the familywise error rate, respectively, to help determine an appropriate value of $r$.

Value

A data frame containing:

r: value of $r$ tested
p.values: raw p-values from the entropy difference tests
statistic: test statistics for each value of $r$
est.loc: estimated location parameter
est.scale: estimated scale parameter
est.shape: estimated shape parameter
ybar: mean entropy difference
ForwardStop: adjusted values from the ForwardStop rule
StrongStop: adjusted values from the StrongStop rule

References

Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of $r$ for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3

Examples

## Not run: 
data(bangkok)
rgloEdtest(bangkok)

## End(Not run)

Log-Likelihood Contributions for the rGLO Model

Description

Computes the observation-wise log-likelihood contributions for the r-largest generalized logistic distribution (rGLO) model.

Usage

rgloLh(data, par)

Arguments

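data

A numeric matrix or data frame containing the r-largest order statistics. Each row represents one block or observation, and columns must be ordered from largest to smallest.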

par

A numeric vector of length 3 giving the location, scale, and shape parameters, respectively.

Details

This function is mainly intended for internal likelihood evaluation. Invalid parameter combinations return Inf, which is often more robust than stopping with an error when used inside iterative procedures.

Value

A numeric vector of log-likelihood contributions, one for each row of data. If the parameter combination is invalid, the function returns Inf.

Random Generation from the Generalized Logistic Distribution for r-Largest Order Statistics

Description

Generates random samples from the generalized logistic distribution for $r$-largest order statistics.

Usage

rglor(n, r, loc = 0, scale = 1, shape = 0.1)

Arguments

n

A positive integer specifying the number of observations.

r

A positive integer specifying the number of order statistics for each observation.

loc

A numeric value specifying the location parameter.

scale

A positive numeric value specifying the scale parameter.

shape

A numeric value specifying the shape parameter.

Details

The function first generates independent uniform random variables and then constructs decreasing variables through recursive transformations. These are transformed using the generalized logistic quantile function qglo.

Value

A list with components:

umat

An n x r matrix of independent uniform random numbers.

wmat

An n x r matrix of transformed uniform variables used to construct decreasing order statistics.

rmat

An n x r matrix of simulated $r$-largest order statistics from the generalized logistic distribution.

Examples

x <- rglor(10, 3, loc = 0, scale = 1, shape = 0.1)
x$rmat

Fit the Four-Parameter Kappa Distribution to r-Largest Order Statistics

Description

Fits the four-parameter kappa distribution to $r$-largest order statistics using maximum likelihood estimation. Stationary and non-stationary models are supported through generalized linear modelling of the location, scale, and two shape parameters.

Usage

rk4d.fit(
  xdat,
  r = NULL,
  penk = NULL,
  penh = NULL,
  ydat = NULL,
  mul = NULL,
  sigl = NULL,
  shl = NULL,
  hl = NULL,
  mulink = identity,
  siglink = identity,
  shlink = identity,
  hlink = identity,
  num_inits = 100,
  muinit = NULL,
  siginit = NULL,
  shinit = NULL,
  hinit = NULL,
  show = TRUE,
  method = "Nelder-Mead",
  maxit = 10000,
  ...
)

Arguments

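xdat

A numeric vector, matrix, or data frame containing the r-largest order statistics. Each row should contain decreasing order statistics for one block or time period.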

r

The number of largest order statistics to use in the fitted model. If NULL, all columns of xdat are used.

penk

Optional penalty for the first shape parameter. Supported values include "CD" and "MS".

penh

Optional penalty for the second shape parameter. Supported values include "MS" and "MSa".

ydat

A matrix or data frame of covariates for non-stationary modelling of the parameters, or NULL for a stationary model. The number of rows must match the number of rows of xdat.

mul, sigl, shl, hl

Integer vectors indicating which columns of ydat are used as covariates for the location, scale, first shape, and second shape parameters, respectively.

mulink, siglink, shlink, hlink

Inverse link functions for the location, scale, first shape, and second shape parameters, respectively.

num_inits

The number of initial parameter sets used in the optimization.

muinit, siginit, shinit, hinit

Numeric vectors giving initial values for the location, scale, first shape, and second shape parameters. If NULL, default initial values based on L-moments are used.

show

Logical. If TRUE, details of the fitted model are printed.

method

Optimization method passed to optim for stationary fits.

maxit

Maximum number of iterations for optim.

...

Additional control arguments passed to the optimizer.

Value

A list with components including:

trans

Logical; TRUE if a non-stationary model is fitted.

model

A list containing mul, sigl, shl, and hl.

link

A character vector describing the inverse link functions.

conv

The convergence code returned by the optimizer.

nllh

The negative log-likelihood evaluated at the fitted parameters.

data

The data used in the fit.

mle

The maximum likelihood estimates.

cov

The estimated covariance matrix when available.

se

The estimated standard errors when available.

vals

A matrix containing fitted values of the location, scale, first shape, and second shape parameters at each observation.

r

The number of order statistics used in the fitted model.

References

Hosking, J. R. M. (1994). The four-parameter kappa distribution. IBM Journal of Research and Development, 38(3), 251–258.

Martins, E. S., & Stedinger, J. R. (2000). Generalized maximum-likelihood generalized extreme-value quantile estimators for hydrologic data. Water Resources Research, 36(3), 737–744. doi:10.1029/1999WR900330

Coles, S., & Dixon, M. (1999). Likelihood-based inference for extreme value models. Extremes, 2(1), 5–23. doi:10.1023/A:1009905222644

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.

Shin, Y., & Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533

Examples

x <- rk4dr(n = 50, r = 2, loc = 10, scale = 2, shape1 = 0.1, shape2 = 0.1)
fit <- rk4d.fit(x$rmat, num_inits = 5)
fit$r
fit$mle

Profile Likelihood for Return Levels under the rK4D Model

Description

Computes and plots the profile log-likelihood for a return level under a stationary r-largest four-parameter kappa distribution (rK4D) model fitted by rk4d.fit.

Usage

rk4d.prof(z, m, xlow, xup, conf = 0.95, nint = 100)

Arguments

z

An object returned by rk4d.fit. The fitted model must represent a stationary model.

m

A return period greater than 1. The profile likelihood is computed for the corresponding return level exceeded with probability $1/m$.

xlow, xup

Lower and upper bounds of the return level grid over which the profile likelihood is evaluated.

conf

A numeric vector of confidence levels for profile likelihood confidence intervals.

nint

The number of grid points used to evaluate the profile likelihood.

Value

A data frame containing the return period, estimated return level, confidence level, lower confidence limit, upper confidence limit, and interval width. A profile likelihood plot is also produced.

Examples

## Not run: 
x <- rk4dr(n = 50, r = 2, loc = 10, scale = 2, shape1 = 0.1, shape2 = 0.1)
fit <- rk4d.fit(x$rmat)
rk4d.prof(fit, m = 100, xlow = 12, xup = 25)

## End(Not run)

Return Levels for the Four-Parameter Kappa Distribution

Description

Computes return levels and their standard errors for a stationary four-parameter kappa model fitted by rk4d.fit.

Usage

rk4d.rl(z, year = c(20, 50, 100, 200), show = FALSE)

Arguments

z

An object returned by rk4d.fit. The fitted model should represent a stationary model.

year

A numeric vector of return periods for which return levels are to be computed.

show

Logical. If TRUE, the estimated return levels and their standard errors are printed.

Details

For a return period $T$, the return level is defined as the quantile exceeded with probability $1/T$. Under the four-parameter kappa distribution, the return level is

$x_T = \mu + \frac{\sigma}{\xi} - \frac{\sigma}{\xi} \left(\frac{1-(1-1/T)^h}{h}\right)^\xi,$

and standard errors are obtained using the delta method.
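
The formula above translates directly into base R. In the sketch below, `k4d_rl` is a hypothetical function name and the parameter values are illustrative. The final comparison uses the known special case $h = -1$, for which the four-parameter kappa quantile reduces to the generalized logistic form $\mu + \sigma/\xi - (\sigma/\xi)\left(\frac{1/T}{1 - 1/T}\right)^{\xi}$.

```r
# Return level under the four-parameter kappa model
# (xi and h must both be non-zero in this form of the formula)
k4d_rl <- function(mu, sigma, xi, h, period) {
  p_nonexceed <- 1 - 1 / period          # non-exceedance probability
  mu + sigma / xi - (sigma / xi) * ((1 - p_nonexceed^h) / h)^xi
}

# Return levels for several return periods with illustrative parameters
rl <- k4d_rl(mu = 10, sigma = 2, xi = 0.1, h = 0.1, period = c(20, 50, 100))
rl

# Special case h = -1: agrees with the generalized logistic return level
glo_100 <- 10 + 2 / 0.1 - (2 / 0.1) * ((1 / 100) / (1 - 1 / 100))^0.1
all.equal(k4d_rl(10, 2, 0.1, h = -1, period = 100), glo_100)
```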

Value

The input object z with two additional components:

rl: a numeric vector of estimated return levels
rlse: a numeric vector of standard errors of the estimated return levels

Examples

x <- rk4dr(n = 50, r = 2, loc = 10, scale = 2, shape1 = 0.1, shape2 = 0.1)
fit <- rk4d.fit(x$rmat, num_inits = 5)
out <- rk4d.rl(fit, year = c(20, 50, 100))
out$rl
out$rlse

Summary of Fitted rK4D Models over Different Values of r

Description

Summarizes fitted four-parameter kappa distribution models for r-largest order statistics over $r = 1, \dots, R$. For each value of r, the function fits the model using rk4d.fit and computes return levels using rk4d.rl.

Usage

rk4d.summary(
  data,
  r = NULL,
  penk = NULL,
  penh = NULL,
  ydat = NULL,
  mul = NULL,
  sigl = NULL,
  shl = NULL,
  hl = NULL,
  mulink = identity,
  siglink = identity,
  shlink = identity,
  hlink = identity,
  num_inits = 100,
  muinit = NULL,
  siginit = NULL,
  shinit = NULL,
  hinit = NULL,
  show = FALSE,
  method = "Nelder-Mead",
  maxit = 10000,
  ...
)

Arguments

data

A numeric vector, matrix, or data frame containing the r-largest order statistics. Each row should contain decreasing order statistics for one block or time period.

r

Optional integer giving the maximum number of order statistics to summarize. If NULL, all available columns are used.

penk

Penalty function for the xi parameter in maximum penalized likelihood estimation.

penh

Penalty function for the h parameter in maximum penalized likelihood estimation.

ydat

A matrix or data frame of covariates for generalized linear modelling of the parameters, or NULL for stationary fitting.

mul, sigl, shl, hl

Integer vectors indicating which columns of ydat are used for the location, scale, first shape, and second shape parameters, respectively.

mulink, siglink, shlink, hlink

Inverse link functions for the location, scale, first shape, and second shape parameters, respectively.

num_inits

Number of initial parameter sets used in optimization.

muinit, siginit, shinit, hinit

Optional initial values for the location, scale, first shape, and second shape parameters.

show

Logical. If TRUE, print details from model fitting.

method

Optimization method passed to optim.

maxit

Maximum number of iterations for optimization.

...

Additional arguments passed to rk4d.fit.

Value

A data frame containing:

r: number of order statistics used
nllh: negative log-likelihood
mu, sigma, xi, h: parameter estimates
mu.se, sigma.se, xi.se, h.se: standard errors
rl20, rl50, rl100, rl200: return levels
rl20.se, rl50.se, rl100.se, rl200.se: standard errors of return levels

Examples

x <- rk4dr(n = 50, r = 3, loc = 10, scale = 2, shape1 = 0.1, shape2 = 0.1)
rk4d.summary(x$rmat, num_inits = 5)
rk4d.summary(x$rmat, penk = "CD", penh = "MS", num_inits = 5)

Entropy Difference Test for rK4D Models

Description

Performs the entropy difference (ED) test for selecting the number of order statistics in the r-largest four-parameter kappa distribution (rK4D) model.

Usage

rk4dEd(data)

Arguments

data

A numeric matrix or data frame containing the r-largest order statistics. Each row represents one block or observation, and columns must be ordered from largest to smallest.

Details

The test compares the entropy of models fitted with $r$ and $r-1$ order statistics and evaluates whether the additional order statistic provides significant information.

This function fits the rK4D model using rk4d.fit and then computes the entropy difference test statistic by comparing the fitted likelihood contributions from models with $r$ and $r-1$ order statistics.

Value

A list containing:

statistics: the entropy difference test statistic
p.value: the two-sided p-value
theta: the estimated parameter vector of the rK4D model
ybar: the sample mean entropy difference

References

Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of $r$ for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3

Shin, Y., & Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533

Examples

## Not run: 
data(bangkok)
rk4dEd(bangkok)

## End(Not run)

Sequential Entropy Difference Test for rK4D Models

Description

Performs the sequential entropy difference (ED) test for selecting the number of order statistics in the r-largest four-parameter kappa distribution (rK4D) model.

Usage

rk4dEdtest(data)

Arguments

data

A numeric matrix or data frame containing the r-largest order statistics. Each row represents one observation (or block), and columns must be ordered from largest to smallest.

Details

The function applies the entropy difference test (rk4dEd) sequentially for $r = 2, \dots, R$. The resulting p-values are adjusted using the ForwardStop and StrongStop stopping rules, which control the false discovery rate and the familywise error rate, respectively, to help determine an appropriate value of $r$.

Value

A data frame containing:

r: value of $r$ tested
p.values: raw p-values from the entropy difference tests
statistic: test statistics for each value of $r$
est.loc: estimated location parameter
est.scale: estimated scale parameter
est.shape1: estimated first shape parameter
est.shape2: estimated second shape parameter
ybar: mean entropy difference
ForwardStop: adjusted values from the ForwardStop rule
StrongStop: adjusted values from the StrongStop rule

References

Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of $r$ for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3

Shin, Y., & Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533

Examples

## Not run: 
data(bangkok)
rk4dEdtest(bangkok)

## End(Not run)

Log-Likelihood Contributions for the rK4D Model

Description

Computes the observation-wise log-likelihood contributions for the r-largest four-parameter kappa distribution (rK4D) model.

Usage

rk4dLh(data, par)

Arguments

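data

A numeric matrix or data frame containing the r-largest order statistics. Each row represents one block or observation, and columns must be ordered from largest to smallest.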

par

A numeric vector of length 4 giving the location, scale, first shape, and second shape parameters.

Value

A numeric vector of log-likelihood contributions for each row of data. If invalid parameter combinations occur, the function returns a large penalty value.

Random Generation from the Four-Parameter Kappa Distribution for r-Largest Order Statistics

Description

Generates random samples from the four-parameter kappa distribution for $r$-largest order statistics.

Usage

rk4dr(n, r, loc = 0, scale = 1, shape1 = 0.1, shape2 = 0.1)

Arguments

n

A positive integer specifying the number of observations.

r

A positive integer specifying the number of order statistics for each observation.

loc

A numeric value specifying the location parameter.

scale

A positive numeric value specifying the scale parameter.

shape1

A numeric value specifying the first shape parameter.

shape2

A numeric value specifying the second shape parameter.

Details

The function first generates independent uniform random variables and then constructs decreasing transformed variables recursively using the second shape parameter. These are transformed by the four-parameter kappa quantile function qk4d.

For valid generation with $r > 1$, the second shape parameter should satisfy $\text{shape2} < 1/(r-1)$.
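
The constraint on the second shape parameter can be checked before simulating. The following is a minimal sketch in base R; the helper check_shape2 is hypothetical and is not part of the package.

```r
# Hypothetical helper: TRUE when shape2 satisfies the stated bound for a given r.
check_shape2 <- function(shape2, r) {
  stopifnot(is.numeric(shape2), is.numeric(r), r >= 1)
  if (r == 1) return(TRUE)  # the bound is only stated for r > 1
  shape2 < 1 / (r - 1)
}

ok_small <- check_shape2(0.1, r = 3)  # bound is 1/2 when r = 3
ok_large <- check_shape2(0.6, r = 3)
print(c(ok_small, ok_large))
```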

Value

A list with components:

umat: an n x r matrix of independent uniform random numbers
wmat: an n x r matrix of transformed uniform variables
rmat: an n x r matrix of simulated $r$-largest order statistics

References

Shin, Y., & Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533

Examples

x <- rk4dr(10, 3, loc = 0, scale = 1, shape1 = 0.1, shape2 = 0.1)
x$rmat

Fit the Logistic Distribution to r-Largest Order Statistics

Description

Fits the logistic distribution to $r$-largest order statistics using maximum likelihood estimation. Stationary and non-stationary models are supported through generalized linear modelling of the location and scale parameters.

Usage

rld.fit(
  xdat,
  r = NULL,
  ydat = NULL,
  mul = NULL,
  sigl = NULL,
  mulink = identity,
  siglink = identity,
  num_inits = 100,
  muinit = NULL,
  siginit = NULL,
  show = TRUE,
  method = "Nelder-Mead",
  maxit = 10000,
  ...
)

Arguments

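xdat

A numeric matrix or data frame of r-largest order statistics, with one row per block and the order statistics in decreasing order across columns.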

r

The number of largest order statistics to use in the fitted model. If NULL, all columns of xdat are used.

ydat

A matrix or data frame of covariates for non-stationary modelling of the parameters, or NULL for a stationary model. The number of rows must match the number of rows of xdat.

mul, sigl

Integer vectors indicating which columns of ydat are used as covariates for the location and scale parameters, respectively.

mulink, siglink

Inverse link functions for the location and scale parameters, respectively.

num_inits

The number of initial parameter sets used in the optimization.

muinit, siginit

Numeric vectors giving initial values for the location and scale parameters. If NULL, default initial values based on L-moments are used.

show

Logical. If TRUE, details of the fitted model are printed.

method

Optimization method passed to optim for stationary fits.

maxit

Maximum number of iterations for optim.

...

Additional control arguments passed to the optimizer.
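
To illustrate how mul, sigl, mulink, and siglink interact, the sketch below builds observation-wise location and scale values from covariates. It assumes that an intercept column is prepended to the selected columns of ydat and that the inverse link is applied to the linear predictor; this convention, the helper design, and the coefficient values are assumptions for illustration, not taken from the package source.

```r
set.seed(42)
n <- 6
ydat <- cbind(trend = seq_len(n) / n, noise = rnorm(n))

mul <- 1           # first column of ydat drives the location
sigl <- NULL       # no covariates for the scale
mulink <- identity
siglink <- exp     # illustrative inverse link that keeps the scale positive

# Hypothetical helper: intercept plus the selected covariate columns.
design <- function(ydat, cols) {
  if (is.null(cols)) return(matrix(1, nrow(ydat), 1))
  cbind(1, ydat[, cols, drop = FALSE])
}

beta_mu <- c(10, 2)   # illustrative intercept and trend coefficient
beta_sig <- log(2)    # illustrative intercept on the log scale

mu <- as.vector(mulink(design(ydat, mul) %*% beta_mu))
sigma <- as.vector(siglink(design(ydat, sigl) %*% beta_sig))
print(cbind(mu, sigma))
```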

Value

A list with components including:

trans: logical; TRUE if a non-stationary model is fitted
model: a list containing mul and sigl
link: a character vector describing the inverse link functions
conv: the convergence code returned by the optimizer
nllh: the negative log-likelihood evaluated at the fitted parameters
data: the data used in the fit
mle: the maximum likelihood estimates
cov: the estimated covariance matrix when available
se: the estimated standard errors when available
vals: a matrix containing fitted values of the location and scale
r: the number of order statistics used in the fitted model

References

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.

Examples

x <- rldr(n = 50, r = 2, loc = 10, scale = 2)
fit <- rld.fit(x$rmat, num_inits = 5)
fit$r
fit$mle

Profile Likelihood for Return Levels under the rLD Model

Description

Computes and plots the profile log-likelihood for a return level under a stationary r-largest logistic distribution (rLD) model fitted by rld.fit.

Usage

rld.prof(z, m, xlow, xup, conf = 0.95, nint = 100)

Arguments

z

An object returned by rld.fit. The fitted model must represent a stationary model.

m

A return period greater than 1. The profile likelihood is computed for the corresponding return level exceeded with probability $1/m$.

xlow, xup

Lower and upper bounds of the return level grid over which the profile likelihood is evaluated.

conf

A numeric vector of confidence levels for profile likelihood confidence intervals.

nint

The number of grid points used to evaluate the profile likelihood.

Details

Value

A data frame containing the return period, estimated return level, confidence level, lower confidence limit, upper confidence limit, and interval width. A profile likelihood plot is also produced.
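
The way a profile likelihood interval is read off a grid can be sketched in base R. The example below assumes $r = 1$, so that the ordinary logistic density (dlogis) serves as the likelihood; the joint density used by the rLD model for $r > 1$ is not reproduced here. The data, grid limits, and the usual chi-squared cutoff are illustrative choices, not the package's implementation.

```r
set.seed(1)
x <- rlogis(200, location = 10, scale = 2)  # simulated block maxima
m <- 100                                    # return period
conf <- 0.95
zgrid <- seq(15, 25, length.out = 100)      # candidate return levels

# For a fixed return level z, the location is determined by the scale:
# z = mu + sigma * log(m - 1), so mu = z - sigma * log(m - 1).
prof <- vapply(zgrid, function(z) {
  nll <- function(sigma) {
    mu <- z - sigma * log(m - 1)
    -sum(dlogis(x, location = mu, scale = sigma, log = TRUE))
  }
  -optimize(nll, interval = c(0.01, 20))$objective
}, numeric(1))

# Keep grid points whose profile log-likelihood lies within the cutoff.
cutoff <- max(prof) - 0.5 * qchisq(conf, df = 1)
ci <- range(zgrid[prof >= cutoff])
zhat <- zgrid[which.max(prof)]
print(c(lower = ci[1], estimate = zhat, upper = ci[2]))
```

If an interval endpoint coincides with xlow or xup, the grid is probably too narrow and should be widened.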

Examples

## Not run: 
x <- rldr(n = 50, r = 2, loc = 10, scale = 2)
fit <- rld.fit(x$rmat)
rld.prof(fit, m = 100, xlow = 12, xup = 25)

## End(Not run)

Return Levels for the Logistic Distribution

Description

Computes return levels and their standard errors for a stationary logistic model fitted by rld.fit.

Usage

rld.rl(z, year = c(20, 50, 100, 200), show = FALSE)

Arguments

z

An object returned by rld.fit. The fitted model should represent a stationary model.

year

A numeric vector of return periods for which return levels are to be computed.

show

Logical. If TRUE, the estimated return levels and their standard errors are printed.

Details

For a return period $T$, the return level is defined as the quantile exceeded with probability $1/T$. Under the logistic distribution with location $\mu$ and scale $\sigma$, the return level is

$x_T = \mu + \sigma \log\left(\frac{1}{\exp(-\log(1-1/T)) - 1}\right),$

which simplifies to $x_T = \mu + \sigma \log(T - 1)$. Standard errors are obtained using the delta method.
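
As a worked illustration, the return level formula reduces to $\mu + \sigma \log(T - 1)$, and the delta method uses the gradient $(1, \log(T - 1))$ with respect to $(\mu, \sigma)$. The sketch below is written in base R under those assumptions; the function names, parameter values, and covariance matrix are hypothetical and do not come from a fitted model.

```r
# Return level for a given return period under a logistic model.
logistic_rl <- function(mu, sigma, period) mu + sigma * log(period - 1)

# Delta-method standard error; V is the 2 x 2 covariance matrix of (mu, sigma).
logistic_rl_se <- function(V, period) {
  vapply(period, function(p) {
    g <- c(1, log(p - 1))  # gradient of the return level w.r.t. (mu, sigma)
    sqrt(drop(t(g) %*% V %*% g))
  }, numeric(1))
}

mu <- 10
sigma <- 2                                 # illustrative estimates
V <- matrix(c(0.09, 0.01, 0.01, 0.04), 2)  # illustrative covariance matrix
period <- c(20, 50, 100)

rl <- logistic_rl(mu, sigma, period)
rlse <- logistic_rl_se(V, period)
print(data.frame(period, rl, rlse))
```

The return levels agree with qlogis(1 - 1/period, mu, sigma) from base R, which gives a quick consistency check.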

Value

The input object z with two additional components:

rl: a numeric vector of estimated return levels
rlse: a numeric vector of standard errors of the estimated return levels

Examples

x <- rldr(n = 50, r = 2, loc = 10, scale = 2)
fit <- rld.fit(x$rmat, num_inits = 5)
out <- rld.rl(fit, year = c(20, 50, 100))
out$rl
out$rlse

Summary of Fitted rLD Models over Different Values of r

Description

Summarizes fitted logistic distribution models for r-largest order statistics over $r = 1, \dots, R$. For each value of r, the function fits the model using rld.fit and computes return levels using rld.rl.

Usage

rld.summary(
  data,
  r = NULL,
  ydat = NULL,
  mul = NULL,
  sigl = NULL,
  mulink = identity,
  siglink = identity,
  num_inits = 100,
  muinit = NULL,
  siginit = NULL,
  show = FALSE,
  method = "Nelder-Mead",
  maxit = 10000,
  ...
)

Arguments

data

A numeric vector, matrix, or data frame containing the r-largest order statistics. Each row should contain decreasing order statistics for one block or time period.

r

Optional integer giving the maximum number of order statistics to summarize. If NULL, all available columns are used.

ydat

A matrix or data frame of covariates for generalized linear modelling of the parameters, or NULL for stationary fitting.

mul, sigl

Integer vectors indicating which columns of ydat are used for the location and scale parameters, respectively.

mulink, siglink

Inverse link functions for the location and scale parameters, respectively.

num_inits

Number of initial parameter sets used in optimization.

muinit, siginit

Optional initial values for the location and scale parameters.

show

Logical. If TRUE, print details from model fitting.

method

Optimization method passed to optim.

maxit

Maximum number of iterations for optimization.

...

Additional arguments passed to rld.fit.

Value

A data frame containing:

r: number of order statistics used
nllh: negative log-likelihood
mu, sigma: parameter estimates
mu.se, sigma.se: standard errors
rl20, rl50, rl100, rl200: return levels
rl20.se, rl50.se, rl100.se, rl200.se: standard errors of return levels

Examples

x <- rldr(n = 50, r = 3, loc = 10, scale = 2)
rld.summary(x$rmat, num_inits = 5)

Random Generation from the Logistic Distribution for r-Largest Order Statistics

Description

Generates random samples from the logistic distribution for $r$-largest order statistics.

Usage

rldr(n, r, loc = 0, scale = 1)

Arguments

n

A positive integer specifying the number of observations.

r

A positive integer specifying the number of order statistics for each observation.

loc

A numeric value specifying the location parameter.

scale

A positive numeric value specifying the scale parameter.

Details

The function first generates independent uniform random variables and then constructs decreasing transformed variables recursively. These are transformed by the logistic quantile function qld.

Value

A list with components:

umat: an n x r matrix of independent uniform random numbers
wmat: an n x r matrix of transformed uniform variables
rmat: an n x r matrix of simulated $r$-largest order statistics

Examples

x <- rldr(10, 3, loc = 0, scale = 1)
x$rmat

Package 'evmr'

Help Index

Bangkok Rainfall Data
Bevern Stream Flow Data
Fit and Compare r-Largest Order Statistics Models
Oykel River Stream Flow Data
Quantile Function of the Gumbel Distribution
Quantile Function of the Generalized Gumbel Distribution
Quantile Function of the Generalized Logistic Distribution
Quantile Function of the Four-Parameter Kappa Distribution
Quantile Function of the Logistic Distribution
Fit the Gumbel Distribution to r-Largest Order Statistics
Profile Likelihood for Return Levels under the rGD Model