World Scientific Publishing Co. Pte. Ltd., 2004. – 192 pp. – Series in Machine Perception and Artificial Intelligence (Vol. 57)
The book covers state-of-the-art research in several areas of time series data mining. The specific problems addressed by the authors of this volume are as follows.
Representation of Time Series. Efficient and effective representation of time series is a key to successful discovery of time-related patterns. The most frequently used representation of single-variable time series is piecewise linear approximation, where the original points are reduced to a set of straight lines (segments). Chapter 1 by Eamonn Keogh, Selina Chu, David Hart, and Michael Pazzani provides an extensive comparative overview of existing techniques for time series segmentation. In view of the shortcomings of existing approaches, the same chapter introduces an improved segmentation algorithm called SWAB (Sliding Window and Bottom-up).
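To make the idea of piecewise linear approximation concrete, the following is a minimal sketch of a plain sliding-window segmenter. It is not the SWAB algorithm of Chapter 1, and the function names, the use of linear interpolation between segment endpoints, and the maximum-vertical-distance error measure are all assumptions chosen for illustration.

```python
def interpolation_error(series, start, end):
    """Largest vertical distance between the points series[start..end]
    and the straight line joining the two endpoints (assumed error measure)."""
    y0, y1 = series[start], series[end]
    span = end - start
    return max(
        abs(series[i] - (y0 + (y1 - y0) * (i - start) / span))
        for i in range(start, end + 1)
    )


def sliding_window_segments(series, max_error):
    """Return a list of (start, end) index pairs; each pair is one segment.

    A segment is grown one point at a time until adding the next point
    would push the interpolation error above max_error.
    """
    segments = []
    n = len(series)
    start = 0
    while start < n - 1:
        end = start + 1
        while end + 1 < n and interpolation_error(series, start, end + 1) <= max_error:
            end += 1
        segments.append((start, end))
        start = end  # adjacent segments share their boundary point
    return segments


if __name__ == "__main__":
    # A rise followed by a fall is reduced from seven points to two segments.
    print(sliding_window_segments([0, 1, 2, 3, 2, 1, 0], max_error=0.1))
```

In this sketch a larger max_error yields fewer, longer segments at the cost of a coarser approximation, which is the basic trade-off any segmentation method has to manage.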
Indexing and Retrieval of Time Series. Since each time series is characterized by a large, potentially unlimited number of points, finding two identical time series for any phenomenon is hopeless. Thus, researchers have been looking for sets of similar data sequences that differ only slightly from each other. The problem of retrieving similar series arises in many areas such as marketing and stock data analysis, meteorological studies, and medical diagnosis.
An overview of current methods for efficient retrieval of time series is presented in Chapter 2 by Magnus Lie Hetland. Chapter 3 (by Eugene Fink and Kevin B. Pratt) presents a new method for fast compression and indexing of time series. A robust similarity measure for retrieval of noisy time series is described and evaluated by Michail Vlachos, Dimitrios Gunopulos, and Gautam Das in Chapter 4.
Change Detection in Time Series. The problem of change point detection in a sequence of values has been studied in the past, especially in the context of time series segmentation (see above). However, the nature of real-world time series may be much more complex, involving multivariate and even graph data. Chapter 5 (by Gil Zeira, Oded Maimon, Mark Last, and Lior Rokach) covers the problem of change detection in a classification model induced by a data mining algorithm from time series data. A change detection procedure for detecting abnormal events in time series of graphs is presented by Horst Bunke and Miro Kraetzl in Chapter 6. The procedure is applied to abnormal event detection in a computer network.
Classification of Time Series. Rather than partitioning a time series into segments, one can see each time series, or any other sequence of data points, as a single object. Classification and clustering of such complex objects may be particularly beneficial for the areas of process control, intrusion detection, and character recognition. In Chapter 7, Carlos J. Alonso González and Juan J. Rodríguez Diez present a new method for early classification of multivariate time series. Their method is capable of learning from series of variable length and of providing a classification when only part of the series is presented to the classifier. A novel concept of representing time series by median strings (see Chapter 8, by Xiaoyi Jiang, Horst Bunke, and Janos Csirik) opens new opportunities for applying classification and clustering methods of data mining to sequential data.
As indicated above, the area of mining time series databases still includes many unexplored and insufficiently explored issues. Specific suggestions for future research can be found in individual chapters. In general, we believe that interesting and useful results can be obtained by applying the methods described in this book to real-world sets of sequential data.
Segmenting time series: a survey and novel approach
A survey of recent methods for efficient retrieval of similar time sequences
Indexing of compressed time series
Indexing time-series under conditions of noise
Change detection in classification models induced from time series data
Classification and detection of abnormal events in time series of graphs
Boosting interval-based literals: variable length and early classification
Median strings: a review