
4.5 Appendix

4.5.8 The Estimator of Chambers, Dorfman, and Wehrly [16]

Chambers, Dorfman, and Wehrly [16] proposed the nonparametric calibration estimator, which uses nonparametric regression to make the prediction robust by adjusting for the bias that arises under the assumed model (4.4):

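\[
\hat{F}_{\mathrm{CDW}}(t) = N^{-1}\Bigl[\,\sum_{i\in s} I(Y_i\le t) + \sum_{j\in r}\hat{G}(t-\hat{\beta}_0-\hat{\beta}_1 x_j) + \sum_{j\in r}\sum_{i\in s} w_{ij}\bigl\{I(Y_i\le t)-\hat{G}(t-\hat{\beta}_0-\hat{\beta}_1 x_i)\bigr\}\Bigr] \tag{4.42}
\]

Here $w_{ij}$ is the weight (4.26). When the model fits the data exactly, the third term of (4.42) vanishes and the estimator coincides with the model-based estimator (4.17). When the model fits the data poorly, the third term adjusts for the resulting bias, so higher accuracy than (4.17) can be expected. Conversely, even when the model is correctly specified, the third term is never exactly zero in practice, so the precision of the estimate is somewhat lower in that case. Furthermore, $\hat{F}_{\mathrm{CDW}}$ equals $\hat{F}_{\mathrm{RKM}}$ when the weight is set to $\pi_i^{-1}-1$. The two estimators differ in that $\hat{F}_{\mathrm{RKM}}$ uses the inclusion probability of label $i$ as the weight, whereas $\hat{F}_{\mathrm{CDW}}$ uses weights computed from the sample by kernel estimation (Valliant, Dorfman and Royall [143]).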
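The structure of (4.42) can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that model (4.4) is a homoscedastic linear model $Y = \beta_0 + \beta_1 x + e$ fitted by ordinary least squares, that $\hat{G}$ is the empirical distribution function of the sample residuals, and that the weight (4.26) is a Nadaraya-Watson weight with a Gaussian kernel. The function names and the `bandwidth` argument are hypothetical.

```python
import math


def ols_fit(xs, ys):
    """Least-squares fit of y = b0 + b1 * x (assumed form of the working model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def cdw_estimate(t, xs_s, ys_s, xs_r, bandwidth):
    """Sketch of F_CDW(t): xs_s, ys_s are sample values, xs_r the nonsample x's."""
    b0, b1 = ols_fit(xs_s, ys_s)
    residuals = [y - b0 - b1 * x for x, y in zip(xs_s, ys_s)]
    n = len(xs_s)

    def g_hat(u):
        # Empirical distribution function of the sample residuals (assumption).
        return sum(1 for e in residuals if e <= u) / n

    # First term: indicators for the sampled units.
    term1 = sum(1 for y in ys_s if y <= t)
    # Second term: model-based predictions for the nonsampled units.
    term2 = sum(g_hat(t - b0 - b1 * xj) for xj in xs_r)
    # Deviation of each sample indicator from its model prediction.
    d = [(1.0 if y <= t else 0.0) - g_hat(t - b0 - b1 * x)
         for x, y in zip(xs_s, ys_s)]
    # Third term: kernel-weighted deviations at each nonsample x_j
    # (assumed Nadaraya-Watson weights with a Gaussian kernel).
    term3 = 0.0
    for xj in xs_r:
        k = [math.exp(-0.5 * ((xi - xj) / bandwidth) ** 2) for xi in xs_s]
        k_sum = sum(k)
        if k_sum > 0.0:  # skip if every kernel value underflows to zero
            term3 += sum(ki * di for ki, di in zip(k, d)) / k_sum
    return (term1 + term2 + term3) / (n + len(xs_r))
```

For example, `cdw_estimate(2.5, xs_s, ys_s, xs_r, bandwidth=1.0)` estimates the population proportion of units with $Y \le 2.5$. When there are no nonsampled units, the sketch reduces to the empirical distribution function of the sample, since only the first term remains.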

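5 Conclusion

This paper analyzed the price distributions of the National Survey of Prices (全国物価統計調査) using multimodality tests and visual methods based on nonparametric kernel estimation. It also examined estimators of the cumulative distribution function in a finite population.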

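Chapter 2 applied a multimodality test that exploits the properties of the kernel density estimator to the price distributions of the National Survey of Prices. Specifically, the price distributions of designated brand items were tested for multimodality, and the causes of multimodality were analyzed for the items in which it was detected. The items analyzed were electric refrigerators, hair tonic, beer, and instant curry. The price distributions of many items, including items not covered in this paper, turned out to be multimodal. The causes of multimodality differ considerably by store size and by item, but the main factors identified were whether the store adopts "discount selling" as a business strategy, the store format, the presence and format of competing stores, the location environment, and the share of part-time workers. In particular, business strategy tended to matter most for large stores, whereas attributes related to the store's business environment mattered most for small stores. The present study did not use several retail-store attributes surveyed in the Census of Commerce (商業統計調査) that are likely to have a large effect on price formation, such as the suppliers of merchandise and annual sales of goods. Using these attributes should allow a more detailed analysis of the causes of multimodality in price distributions. This remains a topic for future work.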

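Methods such as the mode tree and the mode forest have been proposed for examining the multimodality of a distribution visually by means of kernel estimation. Chapter 3 used the mode tree and the mode forest as aids in evaluating the p-values of the multimodality test and applied them to the analysis of price distributions from the National Survey of Prices. The mode forest was also extended to two dimensions, yielding the bivariate mode forest. Random numbers generated from two-dimensional multimodal distributions showed that the bivariate mode forest is a useful tool for examining multimodality. In the National Survey of Prices, the prices of several items are surveyed at each store. To show that the bivariate mode forest is useful for analyzing two-dimensional price distributions, pseudo-data for a two-dimensional price distribution were constructed and examined. The results suggest that the bivariate mode forest can be expected to be useful for detecting multimodality in two-dimensional price distributions.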
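The quantity underlying both the kernel-based multimodality test and the mode tree is the number of modes of the kernel density estimate as a function of the bandwidth. The following is a minimal sketch of that idea only, not the procedure used in Chapters 2 and 3: it assumes a Gaussian kernel, counts local maxima on an evenly spaced grid, and uses hypothetical function names and a hypothetical `grid_size` parameter.

```python
import math


def kde_on_grid(data, bandwidth, grid):
    """Gaussian kernel density estimate evaluated at each grid point."""
    norm = len(data) * bandwidth * math.sqrt(2.0 * math.pi)
    return [sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data) / norm
            for x in grid]


def count_modes(data, bandwidth, grid_size=512):
    """Count local maxima of the density estimate on an evenly spaced grid."""
    lo = min(data) - 3.0 * bandwidth
    hi = max(data) + 3.0 * bandwidth
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    f = kde_on_grid(data, bandwidth, grid)
    # Strict rise followed by no rise, so a flat peak is counted once.
    return sum(1 for i in range(1, grid_size - 1)
               if f[i - 1] < f[i] >= f[i + 1])


def modes_by_bandwidth(data, bandwidths):
    """Number of modes at each bandwidth, the raw material of a mode tree."""
    return [(h, count_modes(data, h)) for h in bandwidths]
```

Calling `modes_by_bandwidth(prices, [0.5, 1.0, 2.0, 5.0])` on a list of prices shows how modes merge as the bandwidth grows. A grid-based count is only an approximation: a grid that is too coarse can miss narrow modes.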

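Chapter 4 dealt with distribution function estimation in finite populations. Estimators have been proposed that use auxiliary variables, when they are available, to estimate the cumulative distribution function. Simulation experiments were run for all of the main estimators, and their unconditional and conditional properties were examined from various angles. The simulation results showed that $\hat{F}_{\mathrm{CD}}$ performed best when the model specification was appropriate. When the model specification was inappropriate, estimators that are less sensitive to model failure, such as $\hat{F}_{\mathrm{RKM}}$ and $\hat{F}_{\mathrm{CDW}}$, gave good results. Estimators that use kernel estimation, such as $\hat{F}_{\mathrm{NRKM}}$, $\hat{F}_{\mathrm{CDW}}$, and $\hat{F}_{\mathrm{DCK}}$, also performed well. For $\hat{F}_{\mathrm{CD}}$ and $\hat{F}_{\mathrm{RKM}}$, methods of estimating the variance by the jackknife have been proposed in order to assess the precision of the estimated distribution function. Here, jackknife variance estimation was also attempted for the other estimators, and its unconditional and conditional properties were examined by simulation. The results showed that for many estimators, such as $\hat{F}_{\mathrm{CD}}$, $\hat{F}_{\mathrm{RKM}}$, $\hat{F}_{\mathrm{NRKM}}$, and $\hat{F}_{\mathrm{DCK}}$, including those that use kernel estimation, the jackknife variance estimator performs well in both its unconditional and its conditional properties. On the other hand, the consistency of the jackknife variance estimator has not been proved for any estimator other than $v_{j\mathrm{CD}}$ and $v_{j\mathrm{RKM}}$. In addition, methods of computing the bandwidth have not been examined for the estimators that use nonparametric methods, and better results can be expected if the bandwidth can be chosen appropriately. These points are left for future work.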
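As a reminder of the idea behind jackknife variance estimation, the following sketch shows the generic delete-one form applied to a plain empirical distribution function. It is an illustration under simplifying assumptions: the variance estimators $v_{j\mathrm{CD}}$ and $v_{j\mathrm{RKM}}$ studied in Chapter 4 are defined for specific finite-population estimators and may differ in detail (for instance, in finite-population corrections). The function names are hypothetical.

```python
def jackknife_variance(estimator, sample):
    """Delete-one jackknife variance of estimator(sample); sample is a list."""
    n = len(sample)
    # Recompute the estimator with each observation left out in turn.
    leave_one_out = [estimator(sample[:i] + sample[i + 1:]) for i in range(n)]
    center = sum(leave_one_out) / n
    return (n - 1) / n * sum((v - center) ** 2 for v in leave_one_out)


def ecdf_at(t):
    """Return a function that evaluates the empirical CDF of a sample at t."""
    return lambda ys: sum(1 for y in ys if y <= t) / len(ys)
```

For the sample mean, this formula reproduces the usual estimate $s^2/n$, which gives a quick sanity check. Passing `ecdf_at(t)` as the estimator yields a variance estimate for the estimated distribution function at the point `t`.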

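In sum, this paper has centered on applications of kernel estimation, the representative nonparametric smoothing method. Kernel estimation has a wide range of applicability and can be applied in many directions. Research on both its theory and its applications will be continued.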

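Acknowledgments: In the writing of this paper, very valuable comments were received from Professor 野口和也, Professor 西郷浩, and Professor Emeritus 佐竹元一郎 of Waseda University and from Professor 舟岡史雄 of Shinshu University. Their help is gratefully acknowledged.

References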

[1] Ahmad, I. A. (1980), “Nonparametric Estimation of an Affinity Measure between Two Absolutely Continuous Distributions with Hypotheses Testing Applications,” Annals of the Institute of Statistical Mathematics, 32, Part A, 223-240.

[2] Ahmad, I. A., and Li, Q. (1997a), “Testing Independence by Nonparametric Kernel Method,” Statistics and Probability Letters, 34, 201-210.

[3] Ahmad, I. A., and Li, Q. (1997b), “Testing Symmetry of an Unknown Density Function by Kernel Method,” Journal of Nonparametric Statistics, 7, 279-293.

[4] Ahmad, I. A., and Van Belle, G. (1974), “Measuring Affinity of Distributions,” in Reliability and Biometry, Statistical Analysis of Life Testing, eds. Proschan and Serfling, SIAM, 651-668.

[5] Ait-Sahalia, Y. (1996a), “Testing Continuous Time Models of the Spot Interest Rate,” Review of Financial Studies, 2, 385-426.

[6] Ait-Sahalia, Y. (1996b), “Nonparametric Pricing of Interest Rate Derivative Securities,” Econometrica, 64, 527-560.

[7] Anderson, N. H., Hall, P., and Titterington, D. M. (1994), “Two Sample Test Statistics for Measuring Discrepancies between Two Multivariate Probability Density Functions,” Journal of Multivariate Analysis, 50, 41-54.

[8] Bickel, P. J. (1982), “On Adaptive Estimation,” The Annals of Statistics, 10, 647-671.

[9] Bickel, P. J., and Rosenblatt, M. (1973), “On Some Global Measure of the Deviations of Density Function Estimates,” The Annals of Statistics, 1, 1071-1095.

[10] Bowman, A. W. (1984), “An Alternative Method of Cross-Validation for the Smoothing of Density Estimates,” Biometrika, 71, 353-360.

[11] Bowman, A. W. (1992), “Density Based Tests for Goodness-of-Fit,” Journal of Statistical Computation and Simulation, 40, 1-13.

[12] Breidt, F. J., and Opsomer, J. (2000), “Local Polynomial Regression Estimators in Survey Sampling,” The Annals of Statistics, 28, 1026-1053.

[13] Cao, R., and González-Manteiga, W. (1994), “A Comparative Study of Several Smoothing Methods in Density Estimation,” Computational Statistics and Data Analysis, 17, 153-176.

[14] Chambers, R. L. (1986), “Outlier Robust Finite Population Estimation,” Journal of the American Statistical Association, 81, 1063-1069.

[15] Chambers, R. L., Dorfman, A. H., and Hall, P. (1992), “Properties of Estimators of the Finite Population Distribution Function,” Biometrika, 79, 577-582.

[16] Chambers, R. L., Dorfman, A. H., and Wehrly, T. E. (1993), “Bias Robust Estimation in Finite

[17] Chambers, R. L., and Dunstan, R. (1986), “Estimating Distribution Functions from Survey Data,” Biometrika, 73, 597-604.

[18] Chaudhuri, P., and Marron, J. S. (1999), “SiZer for Exploration of Structure in Curves,” Journal of the American Statistical Association, 94, 807-823.

[19] Chaudhuri, P., and Marron, J. S. (2000), “Scale Space View of Curve Estimation,” The Annals of Statistics, 28, 408-428.

[20] Cheng, M.-Y., and Hall, P. (1998), “Calibrating the Excess Mass and Dip Tests of Modality,” Journal of the Royal Statistical Society, Ser. B, 60, 579-590.

[21] Cheng, M.-Y., and Hall, P. (1999), “Mode Testing in Difficult Cases,” The Annals of Statistics, 27, 1294-1315.

[22] Chi, S. T. (1992), “An Automatic Bandwidth Selector for Kernel Density Estimate,” Biometrika, 79, 177-182.

[23] Chiu, S.-T. (1990), “On the Asymptotic Distribution of Bandwidth Estimates,” The Annals of Statistics, 18, 1696-1711.

[24] Chiu, S.-T. (1991), “Bandwidth Selection for Kernel Density Estimation,” The Annals of Statistics, 19, 1883-1905.

[25] Chiu, S.-T. (1992), “An Automatic Bandwidth Selector for Kernel Density Estimate,” Biometrika, 79, 771-782.

[26] Cochran, W. G. (1977), Sampling Techniques, 3rd ed., Wiley.

[27] Deville, J.-C., and Särndal, C.-E. (1992), “Calibration Estimators in Survey Sampling,” Journal of the American Statistical Association, 87, 362-382.

[28] Devroye, L., and Györfi, L. (1985), Nonparametric Density Estimation: The L₁ View, Wiley.

[29] Dorfman, A. H. (1993), “A Comparison of Design-Based and Model-Based Estimator of the Finite Population Distribution Function,” Australian Journal of Statistics, 35, 29-41.

[30] Dorfman, A. H. (1994), “A Note on Variance Estimation for the Regression Estimator in Double Sampling,” Journal of the American Statistical Association, 89, 137-140.

[31] Dorfman, A. H., and Hall, P. (1993), “Estimators of the Finite Population Distribution Function Using Nonparametric Regression,” The Annals of Statistics, 21, 1452-1475.

[32] Efron, B. (1979), “Bootstrap Methods: Another Look at the Jackknife,” The Annals of Statistics, 7, 1-26.

[33] Efron, B., and Tibshirani, R. J. (1993), An Introduction to the Bootstrap, Chapman and Hall.

[34] Epanechnikov, V. A. (1969), “Non-Parametric Estimation of a Multivariate Probability Density,” Theory of Probability and its Applications, 14, 153-158.

[35] Fan, J., Hall, P., Martin, M. A., and Patil, P. (1996), “On Local Smoothing of Nonparametric Curve Estimation,” Journal of the American Statistical Association, 91, 258-266.

[36] Fan, J., and Marron, J. S. (1994), “Fast Implementations of Nonparametric Curve Estimators,” Journal of Computational and Graphical Statistics, 3, 35-56.

[37] Fan, Y. (1994), “Testing Goodness of Fit of a Parametric Density Function by Kernel Method,” Econometric Theory, 10, 316-356.

[38] Fan, Y., and Gencay, R. (1993), “Hypotheses Testing Based on Modified Nonparametric Estimation of an Affinity Measure between Two Distributions,” Journal of Nonparametric Statistics, 2, 389-403.

[39] Fan, Y., and Gencay, R. (1995), “A Consistent Nonparametric Test of Symmetry in Linear Regression Models,” Journal of the American Statistical Association, 90, 551-557.

[40] Fan, Y., and Ullah, A. (1999), “On Goodness-of-Fit Tests for Weakly Dependent Processes Using Kernel Method,” Journal of Nonparametric Statistics, 11, 337-360.

[41] Faraway, R. L., and Jhun, M. (1990), “Bootstrap Choice of Bandwidth for Density Estimation,” Journal of the American Statistical Association, 85, 1119-1121.

[42] Fisher, N. I., Mammen, E., and Marron, J. S. (1994), “Testing for Multimodality,” Computational Statistics and Data Analysis, 18, 499-512.

[43] Fisher, N. I., and Marron, J. S. (2001), “Mode Testing via the Excess Mass Estimate,” Biometrika, 88, 499-517.

[44] 舟岡史雄(2002), 「全国物価統計調査の利用の新たな視点」,『統計』, 53, 10-17.

[45] 伏見正則(1989), 『乱数』,東京大学出版会.

[46] Gonzalez-Rivera, G. (1997), “A Note on Adaptation in GARCH Models,” Econometric Reviews, 16, 55-68.

[47] Good, I. J., and Gaskins, R. A. (1971), “Nonparametric Roughness Penalties for Probability Densities,” Biometrika, 58, 255-277.

[48] Good, I. J., and Gaskins, R. A. (1980), “Density Estimation and Bump-Hunting by the Penalized Likelihood Method Exemplified by Scattering and Meteorite Data,” Journal of the American Statistical Association, 75, 42-56.

[49] Hall, P. (1984), “Central Limit Theorem for Integrated Squared Error of Multivariate Nonparametric Density Estimators,” Journal of Multivariate Analysis, 14, 1-16.

[50] Hall, P., DiCiccio, T. J., and Romano, J. P. (1989), “On Smoothing and the Bootstrap,” The Annals of Statistics, 17, 692-704.

[51] Hall, P., and Marron, J. S. (1987), “Estimation of Integrated Squared Density Derivatives,” Statistics

[52] Hall, P., and Marron, J. S. (1991a), “Local Minima in Cross-Validation Function,” Journal of the Royal Statistical Society, Ser. B, 53, 245-252.

[53] Hall, P., and Marron, J. S. (1991b), “Lower Bounds for Bandwidth Selection in Density Estimation,” Probability Theory and Related Fields, 90, 149-173.

[54] Hall, P., Marron, J. S., and Park, B.-U. (1992), “Smoothed Cross-Validation,” Probability Theory and Related Fields, 92, 1-20.

[55] Hall, P., Sheather, S. J., Jones, M. C., and Marron, J. S. (1991), “On Optimal Data-Based Bandwidth Selection in Kernel Density Estimation,” Biometrika, 78, 263-269.

[56] Hall, P., Wolfe, R. C., and Yao, Q. (1999), “Methods for Estimating a Conditional Distribution Function,” Journal of the American Statistical Association, 94, 154-163.

[57] Hall, P., and York, M. (2001), “On the Calibration of Silverman’s Test for Multimodality,” Statistica Sinica, 11, 515-536.

[58] Hall, P., and Wood, A. T. A. (1996), “Approximations to Distributions of Statistics Used for Testing Hypotheses about the Number of Modes of a Population,” Journal of Statistical Planning and Inference, 55, 299-317.

[59] Hansen, M. H., Madow, W. G., and Tepping, B. J. (1983), “An Evaluation of Model-Dependent and Probability-Sampling Inference in Sample Surveys,” Journal of the American Statistical Association, 78, 776-793.

[60] Hartigan, J. A., and Hartigan, P. M. (1985), “The Dip Test of Unimodality,” The Annals of Statistics, 13, 70-84.

[61] Hodges, J. L., and Lehmann, E. L. (1956), “The Efficiency of Some Nonparametric Competitors to the t-Test,” Annals of Mathematical Statistics, 13, 324-335.

[62] Huang, L. (1997), “Testing Goodness-of-Fit Based on a Roughness Measure,” Journal of the American Statistical Association, 92, 1399-1402.

[63] 井出満(2002), 「価格分布から見た小売価格の実態」,『統計』, 53, 18-21.

[64] 石井太・關雅夫・西郷浩・樋田勉 (2004),「二相抽出法を利用した国民生活基礎調査所得分布推定の検討」,『2004年度統計関連学会連合大会講演報告集』, 54-55.

[65] Izenman, A. J., and Sommers, C. J. (1988), “Philatelic Mixtures and Multimodal Densities,” Journal of the American Statistical Association, 83, 941-953.

[66] Jones, M. C., Marron, J. S., and Park, B.-U. (1991), “A Simple Root n Bandwidth Selector,” The Annals of Statistics, 19, 1919-1932.

[67] Jones, M. C., Marron, J. S., and Sheather, S. J. (1996a), “A Brief Survey of Bandwidth Selection for Density Estimation,” Journal of the American Statistical Association, 91, 401-407.

[68] Jones, M. C., Marron, J. S., and Sheather, S. J. (1996b), “Progress in Data-Based Bandwidth Selection for Kernel Density Estimation,” Computational Statistics, 11, 337-381.

[69] Jones, M. C., and Sheather, S. J. (1991), “Using Non-Stochastic Terms to Advantage in Kernel-Based Estimation of Integrated Squared Density Derivatives,” Statistics and Probability Letters, 11, 511-514.

[70] Kuk, A. Y. C. (1988), “Estimation of Distribution Functions and Medians under Sampling with Unequal Probabilities,” Biometrika, 75, 97-103.

[71] Kuk, A. Y. C. (1993), “A Kernel Method for Estimating Finite Population Distribution Functions Using Auxiliary Information,” Biometrika, 80, 385-392.

[72] Kuk, A. Y. C., and Mak, T. K. (1989), “Median Estimation in the Presence of Auxiliary Information,” Journal of the Royal Statistical Society, Series B, 51, 261-269.

[73] Kuk, A. Y. C., and Mak, T. K. (1994), “A Functional Approach to Estimating Finite Population Distribution Functions,” Communications in Statistics, Part A-Theory and Methods, 23, 883-896.

[74] Kuk, A. Y. C., and Welsh, A. H. (2001), “Robust Estimation for Finite Populations Based on a Working Model,” Journal of the Royal Statistical Society, Series B, 63, 277-292.

[75] Kullback, S., and Leibler, R. A. (1951), “On Information and Sufficiency,” Annals of Mathematical Statistics, 22, 79-86.

[76] Kuo, L. (1988), “Classical and Prediction Approaches to Estimating Distribution Functions from Survey Data,” Proceedings of The Section on Survey Research Methods, American Statistical Association, 280-285.

[77] Li, Q. (1996), “Nonparametric Testing of Closeness between Two Unknown Distribution Functions,” Econometric Reviews, 3, 261-274.

[78] Loftsgaarden, D. O., and Quesenberry, C. P. (1965), “A Nonparametric Estimate of a Multivariate Density Function,” Annals of Mathematical Statistics, 36, 1049-1051.

[79] Mak, T. K., and Kuk, A. Y. C. (1993), “A New Method for Estimating Finite-Population Quartiles Using Auxiliary Information,” The Canadian Journal of Statistics, 21, 29-38.

[80] Mammen, E., Marron, J. S., and Fisher, N. I. (1992), “Some Asymptotics for Multimodality Tests Based on Kernel Density Estimates,” Probability Theory and Related Fields, 91, 115-132.

[81] Marchette, D. J., and Wegman, E. J. (1997), “The Filtered Mode Tree,” Journal of Computational and Graphical Statistics, 6, 143-159.

[82] Marron, J. S. (1988), “Automatic Smoothing Parameter Selection: A Survey,” Empirical Economics, 13, 187-208.

[83] Marron, J. S. (1992), “Bootstrap Bandwidth Selection,” in Exploring the Limits of the Bootstrap, eds. R. LePage and L. Billard, Wiley, 249-262.

Source document: ノンパラメトリック・スムージング理論とその応用 (Nonparametric Smoothing Theory and Its Applications), pages 76-88.