When the incidence of count data varies across regions, explanatory variables alone cannot fully describe the underlying process, and the generalized linear model (GLM) is not appropriate. We therefore consider the generalized linear mixed model (GLMM), which introduces random effects. With a large number of samples, however, the high-dimensional integrals involved become difficult to compute, so an efficient way to handle this computation is important. This study applies GLMMs to spatial count data. The methods considered are hierarchical generalized linear models (HGLM), resolution adaptive fixed rank kriging, and the integrated nested Laplace approximation (INLA). We analyze simulated data and weed data collected at the Bjertorp farm in south-west Sweden, and compare the methods using the logarithmic Poisson loss function and the root mean square error (RMSE).
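The two comparison criteria named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the logarithmic Poisson loss is assumed here to be the mean negative Poisson log-likelihood of the observed counts given the predicted means, because its exact definition is not given in the abstract. The function names `rmse` and `log_poisson_loss` and the example counts are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed counts and predicted means."""
    if len(observed) != len(predicted):
        raise ValueError("inputs must have the same length")
    squared = [(y - m) ** 2 for y, m in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def log_poisson_loss(observed, predicted):
    """Mean negative Poisson log-likelihood (assumed definition of the loss).

    For a count y and a predicted mean m > 0, the contribution is
    m - y * log(m) + log(y!), where log(y!) is computed as lgamma(y + 1).
    """
    if len(observed) != len(predicted):
        raise ValueError("inputs must have the same length")
    total = 0.0
    for y, m in zip(observed, predicted):
        if m <= 0:
            raise ValueError("predicted means must be positive")
        total += m - y * math.log(m) + math.lgamma(y + 1)
    return total / len(observed)


if __name__ == "__main__":
    # Hypothetical counts and predicted means from two competing methods.
    counts = [0, 2, 4, 7]
    method_a = [0.5, 2.1, 3.8, 6.5]
    method_b = [2.0, 2.0, 2.0, 2.0]
    for name, pred in [("A", method_a), ("B", method_b)]:
        print(name, rmse(counts, pred), log_poisson_loss(counts, pred))
```

Under both criteria a smaller value indicates a better prediction, so methods can be ranked by either score on held-out locations.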
Thesis Approval Certificate
Acknowledgements
Abstract (Chinese)
Abstract (English)
1 Introduction
2 Literature Review
2.1 Generalized Linear Mixed Models
2.1.1 Linear Regression Model
2.1.2 Generalized Linear Model
2.1.3 Linear Mixed Model
2.1.4 Generalized Linear Mixed Model
2.2 Hierarchical Generalized Linear Models
2.2.1 H-likelihood Estimation and Inference Framework
2.2.2 Computing Marginal MLEs via the h-likelihood
2.3 INLA
2.3.1 Introduction
2.3.2 The INLA Model
2.3.3 Estimating π(u_i | θ, y)
2.3.4 Estimation Procedure
2.3.5 Mesh
2.4 Resolution Adaptive Fixed Rank Kriging
3 Methodology
3.1 Spatial Autocorrelation
3.2 Model Description
3.3 Smoothing
3.4 Evaluation Criteria
3.4.1 h-likelihood
3.4.2 Logarithmic Poisson Loss Function
3.4.3 Root Mean Square Error
4 Data
4.1 Simulated Data
4.2 Real Data
5 Results
5.1 Simulated Data
5.2 Prediction Results
5.2.1 σ² = 10, ψ = 0.75
5.2.2 σ² = 5, ψ = 0.75
5.2.3 σ² = 10, ψ = 1.5
5.2.4 σ² = 5, ψ = 1.5
5.3 Real Data
6 Conclusions and Future Work