Extremes: Our advanced modelling

August 4th, 2016

Understanding extremes is important in many fields, such as weather, finance and industrial standards. Unlike the average, which is a widely used statistical summary, extremes must be modelled, yet they are hard to model because they are rare. Unfortunately, theoretically based statistics frequently do not work well on real data. More importantly, the theory does not capture the common needs and understanding found in practice. We tackle these problems from different angles.


Improved traditional modelling practice (single value)

There are two traditional ways of dealing with extreme values: the Generalised Extreme Value (GEV) distribution for block maxima and the Generalised Pareto (GP) distribution for peaks over a threshold. Both arise as limit distributions, after normalisation, under regularity conditions. However, real data often do not support these theoretical distributions.
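The two traditional approaches differ in how the extreme sub-dataset is selected before any distribution is fitted. The following is a minimal sketch of that selection step only; the function names, the block size and the threshold are illustrative assumptions, and no GEV or GP fitting is attempted.

```python
def block_maxima(series, block_size):
    """Return the maximum of each complete block of `block_size` values.

    A trailing incomplete block is ignored (an assumption of this sketch).
    """
    n_blocks = len(series) // block_size
    return [
        max(series[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]


def peaks_over_threshold(series, threshold):
    """Return the excesses (value minus threshold) of values above `threshold`."""
    return [x - threshold for x in series if x > threshold]


# Hypothetical observations, for illustration only.
daily = [2, 5, 1, 9, 3, 4, 12, 2, 1, 0, 7, 3]

maxima = block_maxima(daily, 4)             # candidate data for a GEV fit
excesses = peaks_over_threshold(daily, 6)   # candidate data for a GP fit
```

In both cases the result depends on a fixed choice (the block size or the threshold) made before modelling.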

By carefully reviewing real data and understanding the theoretical issues, we developed a new distribution, the Extended Burr Distribution, which enjoys many advantages:

  • Statistically and practically sound: flexible and easy to interpret;
  • Includes many frequently used models;
  • Closely linked to the theoretical distributions;
  • Only three parameters;
  • Analytic expressions for distribution characteristics (density, distribution function, moments and quantiles).

The distribution is easy to implement now that the computational problems have been solved and the algorithms developed.
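To illustrate what analytic expressions for the distribution function and quantiles look like in code, here is a minimal sketch. It assumes one common parameterisation of the extended three-parameter Burr XII system, F(x) = 1 - (1 - k (x/λ)^c)^(1/k) for k ≠ 0 and F(x) = 1 - exp(-(x/λ)^c) for k = 0; this page does not state the exact form, so the formula, the function names and the parameter names (`c`, `k`, `lam`) are all assumptions.

```python
import math


def ext_burr_cdf(x, c, k, lam):
    """CDF of the assumed form F(x) = 1 - (1 - k*(x/lam)**c)**(1/k).

    k == 0 is treated as the limiting (Weibull-type) case.
    """
    if x <= 0:
        return 0.0
    z = (x / lam) ** c
    if k == 0:
        return 1.0 - math.exp(-z)
    base = 1.0 - k * z
    if base <= 0:
        # Only reachable for k > 0, where the support has an upper endpoint.
        return 1.0
    return 1.0 - base ** (1.0 / k)


def ext_burr_quantile(p, c, k, lam):
    """Closed-form inverse of ext_burr_cdf for 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    if k == 0:
        z = -math.log(1.0 - p)
    else:
        z = (1.0 - (1.0 - p) ** k) / k
    return lam * z ** (1.0 / c)


# Round trip with hypothetical parameter values: the quantile inverts the CDF.
x90 = ext_burr_quantile(0.9, c=2.0, k=-0.5, lam=1.0)
p90 = ext_burr_cdf(x90, c=2.0, k=-0.5, lam=1.0)
```

Because the quantile has a closed form under this assumed parameterisation, return levels can be computed directly without numerical root finding.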

The applications so far include:

  • Pollution and water quality guidelines;
  • Flood and drought modelling;
  • Flow regime prediction under a changing environment;
  • Ensemble generation of rainfall projections for hydrological models;
  • Survival analysis/risk analysis.


New event-based modelling practice (multi-dimension)

Traditional practice treats extremes as a sub-dataset based on fixed spatial and temporal scales. However,

  • Such a simple practice neither reflects extremes as they are commonly understood nor satisfies real needs, because different time series variables give different results.
  • More seriously, the continuity or extent of an event is completely ignored, because the resulting sub-dataset has the same fixed time scale.

To solve these problems, we proposed the concept of event-based extreme modelling: define an event, then extract measures of it from different aspects. By doing this:

  • A statistical variable becomes one of the dimensions measuring the events;
  • We gain insight into different types of events, which were completely ignored before;
  • We can understand and identify the causes of different types of events (physical causes and mechanisms);
  • The event model as a whole will improve all next-stage studies (for example, impact studies).
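The event-based idea can be sketched as follows. This is an illustration only: it assumes an event is a run of consecutive values above a threshold and that duration, peak and volume above the threshold are the measures of interest. The event definition, the measures and all names here are hypothetical and are not taken from the project itself.

```python
def summarise(series, start, end, threshold):
    """Summarise the run series[start:end] as one multi-dimensional event."""
    run = series[start:end]
    return {
        "start": start,                              # index where the event begins
        "duration": end - start,                     # number of time steps
        "peak": max(run),                            # largest value in the event
        "volume": sum(v - threshold for v in run),   # total excess over threshold
    }


def extract_events(series, threshold):
    """Group consecutive values above `threshold` into events."""
    events = []
    start = None
    for i, x in enumerate(series):
        if x > threshold and start is None:
            start = i                                # an event begins
        elif x <= threshold and start is not None:
            events.append(summarise(series, start, i, threshold))
            start = None                             # the event has ended
    if start is not None:                            # series ends mid-event
        events.append(summarise(series, start, len(series), threshold))
    return events


# Hypothetical flow record, for illustration only.
flow = [1, 4, 6, 5, 1, 0, 7, 2, 8, 9, 1]
events = extract_events(flow, threshold=3)
```

Each event is then a point in several dimensions rather than a single value, so events with similar peaks but different durations or volumes can be told apart.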



  1. Shao (2000), Estimation for hazardous concentrations based on NOEC toxicity data: An Alternative approach, Environmetrics. 11:583-595.
  2. Shao (2002), A reparameterization method for embedded models. Communications in Statistics – Theory and Methods, 31(5), 683-697.
  3. Shao (2002), Maximum likelihood estimation for generalized logistic distribution. Communications in Statistics – Theory and Methods, 31(10), 1687-1700.
  4. Shao (2004). Notes on maximum likelihood estimation for the three-parameter Burr XII distribution. Computational Statistics and Data Analysis 45(3): 675-687.
  5. Shao, W.-C. Ip and H. Wong (2004). Determination of Embedded Distributions. Computational Statistics and Data Analysis 46(2): 317-334.
  6. Shao, H. Wong, J. Xia and W.-C. Ip (2004). Models for extremes using the extended three-parameter Burr XII system with application to flood frequency analysis. Hydrological Sciences Journal 49(4): 685-702.
  7. Shao and X. Zhou (2004), A New Parametric Model for Survival Data with Long-term Survivors. Statistics in Medicine 23:3525-3543.
  8. Li and Q. Shao (2007). A chaos phenomenon of the number of near-maxima for Burr XII distributions. Metrika 66 (1): 89-104.
  9. Shao, Y.D. Chen and L. Zhang (2008). An extension of three-parameter Burr III distribution for low-flow frequency analysis. Computational Statistics and Data Analysis 52: 1304-1314.
  10. Shao, L. Zhang, Y.D. Chen and V.P. Singh (2009). A New Method for Modeling Flow Duration Curves and Predicting Streamflow Regimes under Altered Land Use Conditions. Hydrological Sciences Journal, 54(3): 606-622.

Li-Na Wang, Quanxi Shao, Xiao-Hong Chen, Da-Gang Wang and Yan Li (2012). Flood Changes in Wujiang River, South China, During the Past 50 Years. Hydrological Processes 26(23): 3561-3569.

Li-Na Wang, Xiao-Hong Chen, Quan-Xi Shao and Yan Li (2015). Flood indicators and their clustering features in Wujiang River, South China. Ecological Engineering 76: 66-74.