Research of mixture of experts model for time series prediction
dc.contributor.author  Wang, Xin  en_NZ 
dc.date.available  2011-04-07T03:16:59Z  
dc.date.copyright  2005-11-15  en_NZ 
dc.identifier  http://adt.otago.ac.nz/public/adtNZDU20070312.144924  
dc.identifier.citation  Wang, X. (2005, November 15). Research of mixture of experts model for time series prediction (Thesis, Doctor of Philosophy). University of Otago. Retrieved from http://hdl.handle.net/10523/1487  en 
dc.identifier.uri  http://hdl.handle.net/10523/1487  
dc.description  xxiv, 237 leaves :ill. ; 30 cm. Includes bibliographical references. University of Otago department: Information Science. "15 November 2005".  
dc.description.abstract  For the prediction of chaotic time series, a dichotomy has arisen between local approaches and global approaches. Local approaches have a reputation for simplicity and feasibility, but they generally do not produce a compact description of the underlying system and are computationally intensive. Global approaches require less computation and can yield a global representation of the studied time series. However, because of the complexity of the time series process, it is often difficult to construct a global model that predicts precisely. In addition to these approaches, a combination of the global and local techniques, called mixture of experts (ME), is also possible, in which a smaller number of models work cooperatively to make the prediction. This thesis reports on research into ME models for chaotic time series prediction. Based on a review of time series prediction techniques, an HMM-based ME model called "Timeline" Hidden Markov Experts (THME) is developed, in which the trajectory of the time series is divided into regimes in the state space and regression models called local experts are applied to learn the mapping on each regime separately. The dynamics of the expert combination are governed by an HMM; however, the transition probabilities are designed to be time-varying and conditional on the "real time" information of the time series. For learning the "timeline" HMM, a modified Baum-Welch algorithm is developed and its convergence is proved. Different versions of the model, based on MLP, RBF and SVM experts, are constructed and applied to a number of chaotic time series for both one-step-ahead and multi-step-ahead prediction. Experiments show that, in general, THME achieves better generalization performance than the corresponding single models in one-step-ahead prediction and performance comparable to some published benchmarks in multi-step-ahead prediction. 
Various properties of THME are investigated, such as the feature selection for trajectory dividing, the clustering techniques for regime extraction, the "timeline" HMM for expert combination and the performance of the model with different numbers of experts. A number of interesting future directions for this work are suggested, which include the feature selection for regime extraction, the model selection for transition probability modelling, the extension to distribution prediction and the application to other time series.  en_NZ 
dc.language  en  
dc.publisher  University of Otago  
dc.subject  chaotic time series  en_NZ 
dc.subject  "Timeline" Hidden Markov Experts  en_NZ 
dc.subject  multi-step-ahead prediction  en_NZ 
dc.subject  timeline  en_NZ 
dc.subject  model  en_NZ 
dc.subject  distribution prediction  en_NZ 
dc.subject  time series  en_NZ 
dc.subject  Mixture of Experts  en_NZ 
dc.subject  Markov processes  
dc.subject.lcsh  T Technology (General)  en_NZ 
dc.subject.lcsh  Q Science (General)  en_NZ 
dc.subject.lcsh  HG Finance  en_NZ 
dc.subject.lcsh  HF5601 Accounting  en_NZ 
dc.title  Research of mixture of experts model for time series prediction  en_NZ 
dc.type  Thesis  en_NZ 
dc.description.version  Unpublished  en_NZ 
otago.date.accession  2006-09-04  en_NZ 
otago.school  Information Science  en_NZ 
thesis.degree.discipline  Information Science  en_NZ 
thesis.degree.name  Doctor of Philosophy  
thesis.degree.grantor  University of Otago  en_NZ 
thesis.degree.level  Doctoral Theses  en_NZ 
otago.interloan  yes  en_NZ 
otago.openaccess  Abstract Only  
dc.identifier.eprints  389  en_NZ 
otago.school.eprints  Information Science  en_NZ 
dc.identifier.voyager  1032918 
Files in this item
There are no files associated with this item. This item is not available in full text via OUR Archive. If you would like to read this item, please apply for an interlibrary loan from the University of Otago via your local library. If you are the author of this item, please contact us if you wish to discuss making the full text publicly available. 