PRINTED FROM the OXFORD RESEARCH ENCYCLOPEDIA, CLIMATE SCIENCE (oxfordre.com/climatescience). (c) Oxford University Press USA, 2019. All Rights Reserved. Personal use only; commercial use is strictly prohibited (for details see Privacy Policy and Legal Notice).

date: 13 November 2019

Summary and Keywords

Clustering techniques are used in the analysis of weather and climate to identify discrete groups of atmospheric and oceanic structures and evolutions that occur more frequently than would be expected based on a background distribution, such as a multivariate Gaussian distribution. Some of the techniques identify states that are also unusually long-lived (or persistent). Familiar examples of atmospheric states identified from cluster analysis include a small number of seasonal mean midlatitude response patterns to El Niño events, and on intra-seasonal timescales the North Atlantic Oscillation and the Pacific–North American patterns. On weather timescales, cluster analysis has been used to objectively identify a number of typical synoptic patterns familiar to forecasters. Cluster analysis has also been used to categorize cyclone tracks.

A large variety of clustering techniques are available. One approach is to determine whether the underlying probability distribution contains multiple, distinct peaks, and to identify these peaks. The existence of more than one peak would indicate the existence of preferred states. These techniques rely on kernel density estimation and mixture modeling, and are most successful when applied to a very low-dimensional representation of the state space. The identification of multiple preferred states in higher dimensional representations can be achieved with the k-means and hierarchical clustering techniques. These techniques can be applied to cyclone tracks as well as to the usual meteorological variables.
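As an illustration of the peak-finding idea, the following minimal sketch estimates a one-dimensional density with a Gaussian kernel and counts its local maxima on a grid. The sample is synthetic and stands in for a hypothetical low-dimensional index (for example, a leading principal component); the bandwidth of 0.5, the grid, and the function names are assumptions chosen for this example, not values taken from the article.

```python
import math

def gaussian_kde(sample, bandwidth, x):
    """Gaussian kernel density estimate of `sample`, evaluated at point x."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample)

def find_peaks(sample, bandwidth, grid):
    """Return grid points where the estimated density has a strict local maximum."""
    density = [gaussian_kde(sample, bandwidth, x) for x in grid]
    return [
        grid[i]
        for i in range(1, len(grid) - 1)
        if density[i] > density[i - 1] and density[i] > density[i + 1]
    ]

# Synthetic sample (an assumption for illustration): two groups of values
# centered near -2 and +2, mimicking two preferred states of a 1-D index.
sample = [-2 + 0.1 * i for i in range(-5, 6)] + [2 + 0.1 * i for i in range(-5, 6)]

# Evaluation grid from -4 to 4 in steps of 0.05.
grid = [-4 + 0.05 * i for i in range(161)]

peaks = find_peaks(sample, bandwidth=0.5, grid=grid)
print("number of density peaks:", len(peaks))
print("peak locations:", peaks)
```

The number of peaks found depends strongly on the bandwidth: a very large value merges the two groups into a single peak, while a very small one produces a peak at nearly every data point. In practice, any claim of multiple preferred states would also need a significance test against a unimodal background distribution, which this sketch omits.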

In certain applications it may be desirable to allow a given state to belong to multiple clusters with differing probabilities. The mixture modeling technique gives such probabilities, as does the fuzzy clustering generalization of the k-means approach. Self-organizing maps provide a further technique, one that objectively identifies an ordered array of states (or patterns) that best fits the underlying distribution in some sense.
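The idea of graded cluster membership can be sketched with a small fuzzy c-means loop, the standard fuzzy generalization of k-means. The two-dimensional points, the initial centers, and the fuzziness exponent m = 2 are all assumptions made for illustration; the sketch is not the specific procedure used in any study the article discusses.

```python
def memberships(points, centers, m=2.0, floor=1e-12):
    """Fuzzy membership of each point in each cluster (each row sums to 1)."""
    result = []
    for p in points:
        # Squared distances to every center, floored to avoid division by zero.
        d2 = [max(sum((a - b) ** 2 for a, b in zip(p, c)), floor) for c in centers]
        row = [
            1.0 / sum((d2[k] / d2[j]) ** (1.0 / (m - 1.0)) for j in range(len(centers)))
            for k in range(len(centers))
        ]
        result.append(row)
    return result

def update_centers(points, u, m=2.0):
    """Recompute each center as the membership-weighted mean of the points."""
    dim = len(points[0])
    centers = []
    for k in range(len(u[0])):
        weights = [row[k] ** m for row in u]
        total = sum(weights)
        centers.append(
            tuple(sum(w * p[d] for w, p in zip(weights, points)) / total for d in range(dim))
        )
    return centers

def fuzzy_c_means(points, centers, m=2.0, n_iter=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    for _ in range(n_iter):
        u = memberships(points, centers, m)
        centers = update_centers(points, u, m)
    return centers, memberships(points, centers, m)

# Synthetic states (an assumption): two tight groups plus one state halfway between.
points = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (2.0, 2.0),
          (4.0, 4.0), (3.5, 4.0), (4.0, 3.5)]
centers, u = fuzzy_c_means(points, centers=[(0.0, 0.0), (4.0, 4.0)])

for p, row in zip(points, u):
    print(p, [round(v, 3) for v in row])
```

States close to a cluster center receive a membership near 1 for that cluster, while the intermediate state at (2, 2) is shared about equally between the two. Hard k-means would instead force it into one cluster or the other.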

An alternative approach that identifies not only preferred states but also ones that are unusually persistent is the Hidden Markov method. The Hidden Markov method makes use of an underlying “hidden variable” whose evolution is modeled by a Markov process. This method can be generalized further to detect long-term changes in the population of the clusters by letting the evolution of the hidden state be governed by a non-stationary finite element vector autoregressive factor process.
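To show how a Markov-evolving hidden variable favors persistent states, the sketch below decodes the most likely hidden-state sequence for a short scalar series using the Viterbi algorithm. It assumes the model parameters are already known (two states with Gaussian emissions of means 0 and 2, unit standard deviation, and a 0.9 probability of remaining in the same state); in a real application these would be estimated from data. All values and names are hypothetical.

```python
import math

def log_gauss(x, mean, sigma):
    """Log of the Gaussian probability density with the given mean and sigma."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (x - mean) ** 2 / (2.0 * sigma ** 2)

def viterbi(obs, means, sigma, stay_prob):
    """Most likely hidden-state path for a Gaussian-emission hidden Markov model."""
    n = len(means)
    switch_prob = (1.0 - stay_prob) / (n - 1)
    log_trans = [
        [math.log(stay_prob if i == j else switch_prob) for j in range(n)]
        for i in range(n)
    ]
    # Uniform initial distribution over the hidden states.
    score = [math.log(1.0 / n) + log_gauss(obs[0], mu, sigma) for mu in means]
    back = []
    for x in obs[1:]:
        new_score, pointers = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] + log_trans[i][j])
            pointers.append(best_i)
            new_score.append(score[best_i] + log_trans[best_i][j] + log_gauss(x, means[j], sigma))
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Hypothetical scalar index with two brief excursions (values 1.2 and 0.8).
obs = [0.1, -0.2, 0.3, 1.2, 0.0, -0.1, 2.1, 1.9, 2.3, 0.8, 2.0, 2.2]
means = [0.0, 2.0]

hmm_path = viterbi(obs, means, sigma=1.0, stay_prob=0.9)
# For comparison: assign each value to the nearest mean, ignoring time order.
pointwise = [min(range(len(means)), key=lambda k: abs(x - means[k])) for x in obs]

print("pointwise:", pointwise)
print("hmm path: ", hmm_path)
```

The nearest-mean assignment flips state at each brief excursion, whereas the decoded path changes state only once, because each switch carries a transition penalty. This is the sense in which the approach selects states that are persistent as well as frequently occupied.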

Keywords: clustering analysis, climate science, probability density estimation, climate statistics, clustering techniques
