
(1)

Propensity score matching using SAS (SASによる傾向スコアマッチング)

○Ryuji Uozumi 1*, Shinjo Yada 2, Michio Yamamoto 3, and Atsushi Kawaguchi 4

1 Department of Biomedical Statistics and Bioinformatics, Graduate School of Medicine, Kyoto University

2 A2 Healthcare Corporation

3 Okayama University

4 Saga University

(2)

Matching for observational study data

Control group

Treated group

1:1 matching

(3)

Matching in observational studies

Control group

Treated group

(4)

Propensity score matching by SAS users

Propensity score ...

SAS ...

PROC LOGISTIC ...

(5)

Topics of this presentation

• Nearest-neighbor matching on the propensity score

• Matching with JMP / R

(6)

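Propensity score (balancing score)

➢ $Z_i$: variable indicating the group of subject $i$ ($Z_i = 1$: treated group, $Z_i = 0$: control group)

➢ $X_i$: covariate vector of subject $i$

Rosenbaum and Rubin (1983) Biometrika

Propensity score model

$$e_i = \Pr(Z_i = 1 \mid X_i)$$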

Example: $X_i$ consists of sex (Gender), age (Age), and BMI (Bmi)

proc logistic data=<input dataset name>;
  class Gender;
  model z(event='1') = Gender Age Bmi;
  output out=<output dataset name> pred=ps;
run;

(7)

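Nearest-neighbor matching on the propensity score

Control group

Treated group

A treated subject and a control subject with propensity scores $e_i$ and $e_j$ are paired when the distance between them is smallest. The distance is measured either on the propensity score itself or on its logit:

$$|e_i - e_j|$$

$$|\operatorname{logit}(e_i) - \operatorname{logit}(e_j)|$$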
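The two distance measures can be sketched in a few lines. This is a minimal Python illustration, not SAS code and not part of the original slides; the function names `logit` and `ps_distance` are hypothetical, and the two propensity scores are those of the first matched pair in the PSSORT00 listing on slide (23).

```python
import math

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("propensity score must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))

def ps_distance(e_i, e_j, on_logit_scale=True):
    """Absolute distance between two propensity scores.

    on_logit_scale=True  -> |logit(e_i) - logit(e_j)|
    on_logit_scale=False -> |e_i - e_j|
    """
    if on_logit_scale:
        return abs(logit(e_i) - logit(e_j))
    return abs(e_i - e_j)

# Propensity scores of the first matched pair listed on slide (23)
raw = ps_distance(0.64115, 0.63513, on_logit_scale=False)
lps = ps_distance(0.64115, 0.63513)
```

The logit scale stretches differences near 0 and 1, which is one reason the caliper on the following slides is defined on the logit of the propensity score.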

(8)

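Caliper

Control group

Treated group

A pair is accepted only if the distance between the two subjects is within a tolerance (the caliper). The distance is either $|e_i - e_j|$ or the difference of the logits, for which the caliper is expressed as a multiple of the standard deviation of the logit of the propensity score:

$$|\operatorname{logit}(e_i) - \operatorname{logit}(e_j)| \le \text{Caliper} \times \operatorname{SD}[\operatorname{logit}(e)]$$

Rosenbaum and Rubin (1985) Am Stat

Caliper = 0.20, 0.25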
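As a rough illustration of the caliper rule, the check below compares a logit difference with Caliper × SD[logit(e)]. It is a Python sketch under stated assumptions: the function name `within_caliper` and the five propensity scores are hypothetical, and the sample standard deviation over all subjects is used, which may differ from the estimate a particular software package uses.

```python
import math
import statistics

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def within_caliper(e_i, e_j, all_scores, caliper=0.20):
    """True if |logit(e_i) - logit(e_j)| <= caliper * SD[logit(e)].

    all_scores: propensity scores of all subjects, used to estimate
    SD[logit(e)] (assumption: sample standard deviation).
    """
    sd_logit = statistics.stdev([logit(e) for e in all_scores])
    return abs(logit(e_i) - logit(e_j)) <= caliper * sd_logit

# Hypothetical propensity scores, for illustration only
scores = [0.2, 0.3, 0.4, 0.5, 0.6]
close_pair = within_caliper(0.40, 0.42, scores)   # small logit difference
far_pair = within_caliper(0.40, 0.50, scores)     # larger logit difference
```

With these made-up scores the first pair falls inside a 0.20 caliper and the second does not.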

(9)

Example of how nearest-neighbor matching is described in a clinical research paper

Bangalore et al. (2015) N Engl J Med; Taguri (2017) Coronary Intervention

"Matching was performed with the use of a 1:1 matching protocol without replacement (greedy-matching algorithm), with a caliper width equal to 0.2 of the standard deviation of the logit of the propensity score."

• After matching, check the between-group distributions of the adjusted variables

• If bias remains, reconsider the caliper value
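The procedure described in the quotation (1:1, without replacement, greedy, caliper of 0.2 SD of the logit) can be sketched as follows. This is a simplified Python illustration, not the algorithm of any specific package: the function name `greedy_match`, the subject IDs, and the propensity scores are hypothetical, the SD is the sample SD over all subjects, and treated subjects are processed in descending order of propensity score as one possible ordering.

```python
import math
import statistics

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def greedy_match(treated, controls, caliper=0.20):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, controls: dicts mapping a subject ID to its propensity score.
    Returns a list of (treated_id, control_id) pairs.
    """
    lps_t = {i: logit(e) for i, e in treated.items()}
    lps_c = {j: logit(e) for j, e in controls.items()}
    # Caliper width on the logit scale (assumption: sample SD over all subjects)
    width = caliper * statistics.stdev(list(lps_t.values()) + list(lps_c.values()))
    available = dict(lps_c)          # controls not yet used
    pairs = []
    # Process treated subjects from the highest to the lowest propensity score
    for i in sorted(lps_t, key=lps_t.get, reverse=True):
        if not available:
            break
        # Nearest remaining control on the logit scale
        j = min(available, key=lambda c: abs(available[c] - lps_t[i]))
        if abs(available[j] - lps_t[i]) <= width:
            pairs.append((i, j))
            del available[j]         # without replacement: each control is used once
    return pairs

# Hypothetical data, for illustration only
treated = {"T1": 0.60, "T2": 0.35, "T3": 0.90}
controls = {"C1": 0.58, "C2": 0.33, "C3": 0.30, "C4": 0.10}
pairs = greedy_match(treated, controls)
```

In this toy data set T3 has no control within the caliper and stays unmatched, which is how a caliper trades sample size for closer pairs.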

(10)

Nearest-neighbor matching with JMP (version 11.0 or later)

• Add-in download URL:

(11)

(12)

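Specifying the caliper in JMP

$$|\operatorname{logit}(e_i) - \operatorname{logit}(e_j)| \le \text{Caliper} \times \sqrt{\frac{\operatorname{Var}[\operatorname{logit}(e) \mid Z = 1] + \operatorname{Var}[\operatorname{logit}(e) \mid Z = 0]}{2}}$$

For comparison, the caliper definition shown earlier:

$$|\operatorname{logit}(e_i) - \operatorname{logit}(e_j)| \le \text{Caliper} \times \operatorname{SD}[\operatorname{logit}(e)]$$

Rosenbaum and Rubin (1985) Am Stat

Caliper = 0.20, 0.25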

(13)

(14)

Case studies using nearest-neighbor matching

Case study using nearest-neighbor matching with R

• Yamashita et al. (2017). Current status and outcomes of direct oral anticoagulant use in patients with atrial fibrillation in the real-world: The Fushimi AF Registry. Circ J.

Case studies using nearest-neighbor matching with JMP

• Hamatani et al. (2015). Low body weight is associated with the incidence of stroke in atrial fibrillation patients: Insight from the Fushimi AF Registry. Circ J.

• Hida et al. (2017). Open versus laparoscopic surgery for advanced low rectal cancer: a large multicenter propensity score matched cohort study in Japan. Ann Surg.

(15)

Nearest-neighbor matching with R

### Load the package ###
library(Matching)

### Read the CSV file that contains the estimated propensity scores ###
data = read.csv("psout00.csv")

### Treatment indicator: TRUE for the treated category of the propensity score model ###
Trt = data$Drug == "Drug_X"

### Caliper value (Match() interprets it in standard deviation units of X) ###
caliper = 0.20

### Logit transformation of the propensity score ###
data <- transform(data, ps.logit=log(ps / (1 - ps)))

### Random seed ###
set.seed(123456)

### 1:1 nearest-neighbor matching without replacement ###
Matching = Match(Y=NULL, Tr=Trt, X=data$ps.logit, M=1, replace=FALSE, caliper=caliper)
summary(Matching)

(16)

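Propensity score matching with SAS

I want to do the matching in SAS ...

SAS macros that perform propensity score matching in the DATA step have been reported (Coca-Perraillon, 2007; Lanehart et al., 2012)

If it takes a SAS macro, that is no different from using R; couldn't the matching be done with a SAS/STAT procedure ...?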

(17)

SAS/STAT release history

SAS 9.3: SAS/STAT 9.3 (2011/07), 12.1 (2012/08)

SAS 9.4: SAS/STAT 12.3 (2013/07), 13.1 (2013/12), 13.2 (2014/08), 14.1 (2015/07), 14.2 (2016/11)

SAS/STAT 14.2: PSMATCH procedure

(18)

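Roles of the main statements of the PSMATCH procedure

PSMODEL: specify the covariates and compute the propensity score

MATCH: specify the matching conditions and method

OUTPUT: output the data set after matching

ASSESS: check the balance of the covariates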

proc psmatch data=<input dataset name>;
  class Z <categorical covariates>;
  psmodel Z(treated='1') = <covariates>;
  match <matching conditions and method>;
  assess <assessment of balance in the covariates etc.>;
  output out(obs=match)=<output dataset name>;
run;

(19)

Observational study data set: Drugs (n = 486)

• Variables used in this presentation

  • Treatment (Drug = 'Drug_X': treated group, 'Drug_A': control group)

  • Age (Age)

  • Sex (Gender = 'Male', 'Female')

(20)

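PSMODEL: propensity score model

proc psmatch data=drugs region=allobs;
  class Drug Gender;
  /* treated= must name a level of Drug; the treated group is 'Drug_X' */
  psmodel Drug(treated='Drug_X') = Gender Age Bmi;
  output out(obs=all)=psout00
         ps=ps attwgt=attw atewgt=atew;
run;

Data set PSOUT00 (first 3 observations)

| Gender | Age | Bmi | Drug | PatientID | PS | ATEW | ATTW |
|---|---|---|---|---|---|---|---|
| Male | 29 | 22.02 | Drug_X | 284 | 0.36444 | 2.744 | 1 |
| Male | 45 | 26.68 | Drug_A | 201 | 0.22296 | 1.2869 | 0.28694 |
| Male | 42 | 21.84 | Drug_A | 147 | 0.11323 | 1.1277 | 0.12768 |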
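The ATEW and ATTW columns follow from the propensity score. The Python sketch below uses the usual inverse-probability definitions (ATE weight: 1/e for treated, 1/(1−e) for control; ATT weight: 1 for treated, e/(1−e) for control), which appear to reproduce the listed values; the function names are hypothetical, and small differences in the last digit are expected because the PS values in the listing are rounded.

```python
def ate_weight(ps, treated):
    """Inverse probability of treatment weight for the ATE."""
    return 1.0 / ps if treated else 1.0 / (1.0 - ps)

def att_weight(ps, treated):
    """Weight for the ATT: treated subjects get 1, controls get the odds e/(1-e)."""
    return 1.0 if treated else ps / (1.0 - ps)

# (Drug, PS) for the first three observations of PSOUT00
rows = [("Drug_X", 0.36444), ("Drug_A", 0.22296), ("Drug_A", 0.11323)]

weights = []
for drug, ps in rows:
    is_treated = (drug == "Drug_X")
    weights.append((ate_weight(ps, is_treated), att_weight(ps, is_treated)))
```

Each tuple in `weights` holds the (ATE, ATT) weight for one row and can be compared with the ATEW and ATTW columns above.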

(21)

MATCH: running the propensity score matching

proc psmatch data=Drugs region=allobs;
  class Drug Gender;
  psmodel Drug(treated='Drug_X') = <covariates>;
  match method=greedy(k=1 order=descending)
        stat=lps caliper(mult=stddev)=0.20;
  output out(obs=match)=psmatch00
         ps=ps lps=lps
         matchwgt=match matchid=matchsort;
run;

• 1:1 nearest-neighbor matching

$$|\operatorname{logit}(e_i) - \operatorname{logit}(e_j)| \le \text{Caliper} \times \operatorname{SD}[\operatorname{logit}(e)]$$

(22)

Matching methods available in the MATCH statement

| Replacement | Optimal | Type of matching | MATCH option |
|---|---|---|---|
| Yes | No | Greedy nearest neighbor matching (with replacement) | METHOD = REPLACE |
| No | No | Greedy nearest neighbor matching | METHOD = GREEDY |
| No | Yes | Fixed ratio matching | METHOD = OPTIMAL |
| No | Yes | Variable ratio matching | METHOD = VARRATIO |
| No | Yes | Full matching | METHOD = FULL |

(23)

OUTPUT: data set after matching

proc psmatch data=Drugs region=allobs;
  class Drug Gender;
  psmodel Drug(treated='Drug_X') = <covariates>;
  match <matching conditions and method>;
  output out(obs=match)=psmatch00 ps=ps lps=lps
         matchwgt=match matchid=matchsort;
run;

/* Sort the matched data set by descending propensity score */
proc sort data=psmatch00 out=pssort00;
  by descending ps;
run;

Data set PSSORT00 (first 5 observations)

| Gender | Age | Bmi | Drug | PatientID | PS | match | matchsort |
|---|---|---|---|---|---|---|---|
| Male | 27 | 25.87 | Drug_X | 87 | 0.64115 | 1 | 1 |
| Male | 27 | 25.76 | Drug_A | 123 | 0.63513 | 1 | 1 |
| Female | 30 | 27.75 | Drug_A | 200 | 0.61831 | 1 | 2 |
| Male | 33 | 28.06 | Drug_X | 349 | 0.60481 | 1 | 2 |
| Male | 27 | 25.14 | Drug_X | 114 | 0.60047 | 1 | 3 |
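The matchsort column identifies which treated and control subjects form a pair. As a small Python sketch (not SAS code; the function name `pair_differences` is hypothetical), the five listed rows can be grouped by match ID to inspect the within-pair difference in propensity score. Only pairs with both members among the listed rows are reported, so match ID 3, whose control falls outside the first five observations, is skipped.

```python
from collections import defaultdict

# (PatientID, Drug, PS, matchsort) for the five listed observations of PSSORT00
rows = [
    (87,  "Drug_X", 0.64115, 1),
    (123, "Drug_A", 0.63513, 1),
    (200, "Drug_A", 0.61831, 2),
    (349, "Drug_X", 0.60481, 2),
    (114, "Drug_X", 0.60047, 3),
]

def pair_differences(rows, treated_label="Drug_X"):
    """Return {match ID: treated PS - control PS} for complete pairs only."""
    matched_sets = defaultdict(dict)
    for patient_id, drug, ps, match_id in rows:
        role = "treated" if drug == treated_label else "control"
        matched_sets[match_id][role] = ps
    return {
        match_id: members["treated"] - members["control"]
        for match_id, members in matched_sets.items()
        if len(members) == 2
    }

diffs = pair_differences(rows)
```

Both complete pairs differ by less than 0.015 on the propensity score scale, and the difference can go in either direction.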

(24)

ASSESS: assessing the between-group differences in the variables

ods graphics on;

proc psmatch data=drugs region=allobs;
  class Drug Gender;
  psmodel Drug(treated='Drug_X') = <covariates>;
  match method=greedy(k=1 order=descending)
        stat=lps caliper(mult=stddev)=0.20;
  assess lps var=(Gender Age Bmi)
         / weight=none varinfo plots=(all);
run;

(25)

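Variable information before and after matching

Variable Information

| Variable | Obs | Treated (Drug = Drug_X) N | Treated Mean | Treated Std Dev | Control (Drug = Drug_A) N | Control Mean | Control Std Dev |
|---|---|---|---|---|---|---|---|
| LPS | All | 113 | -0.880615 | 0.681761 | 373 | -1.520586 | 0.844486 |
| LPS | Region | 113 | -0.880615 | 0.681761 | 373 | -1.520586 | 0.844486 |
| LPS | Matched | 113 | -0.880615 | 0.681761 | 113 | -0.889200 | 0.674687 |
| Age | All | 113 | 36.309735 | 5.534114 | 373 | 40.404826 | 6.579103 |
| Age | Region | 113 | 36.309735 | 5.534114 | 373 | 40.404826 | 6.579103 |
| Age | Matched | 113 | 36.309735 | 5.534114 | 113 | 35.884956 | 5.468664 |
| Bmi | All | 113 | 24.492566 | 1.863797 | 373 | 23.753271 | 1.980778 |
| Bmi | Region | 113 | 24.492566 | 1.863797 | 373 | 23.753271 | 1.980778 |
| Bmi | Matched | 113 | 24.492566 | 1.863797 | 113 | 24.309027 | 1.882123 |
| Gender | All | 113 | 0.433628 | 0.495575 | 373 | 0.458445 | 0.498270 |
| Gender | Region | 113 | 0.433628 | 0.495575 | 373 | 0.458445 | 0.498270 |
| Gender | Matched | 113 | 0.433628 | 0.495575 | 113 | 0.495575 | 0.499980 |

Number of subjects

| | Treated group | Control group |
|---|---|---|
| Before matching | 113 | 373 |
| After matching | 113 | 113 |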

(26)

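Standardized between-group differences for each variable

Standardized Variable Differences (Treated - Control)

Cells that did not survive text extraction are left blank.

| Variable | Mean Difference: All Obs | Mean Difference: Region Obs | Mean Difference: Matched Obs | Mean Difference Divisor | Standardized Mean Difference: All Obs | Standardized Mean Difference: Region Obs | Standardized Mean Difference: Matched Obs |
|---|---|---|---|---|---|---|---|
| LPS | 0.639971 | | 0.008584 | 0.767448 | 0.833894 | | 0.011185 |
| Age | -4.095091 | | 0.424779 | 6.079104 | -0.673634 | | 0.069875 |
| Bmi | 0.739296 | | 0.18354 | 1.923178 | 0.384414 | | 0.095436 |
| Gender | -0.024817 | -0.024817 | -0.061947 | 0.496925 | -0.049941 | | -0.124661 |

Age (all observations):

$$\text{Standardized Mean Difference} = \frac{\text{Mean Difference}}{\text{Mean Difference Divisor}} = \frac{-4.095091}{6.079104} = -0.673634$$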
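The age calculation can be checked numerically. The Python sketch below (hypothetical function name, not SAS code) divides the difference in group means by the divisor. Judging from the single Divisor column, the same divisor, computed from all observations, is applied to the matched observations as well.

```python
def standardized_mean_difference(mean_treated, mean_control, divisor):
    """(treated mean - control mean) / divisor."""
    return (mean_treated - mean_control) / divisor

AGE_DIVISOR = 6.079104  # Mean Difference Divisor for Age from the table

# Age, all observations: treated mean 36.309735, control mean 40.404826
smd_all = standardized_mean_difference(36.309735, 40.404826, AGE_DIVISOR)

# Age, matched observations: treated mean 36.309735, control mean 35.884956
smd_matched = standardized_mean_difference(36.309735, 35.884956, AGE_DIVISOR)
```

Matching shrinks the standardized difference in age by roughly an order of magnitude, from about -0.67 to about 0.07.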

(27)

Variable information before and after matching

Variable Information (same table as on slide (25))

Pooled standard deviation of age before matching, computed from the group standard deviations (treated 5.534114, control 6.579103):

$$\sqrt{\frac{5.534114^2 + 6.579103^2}{2}} = 6.079104$$
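The same pooled standard deviation can be computed for every variable. In the Python sketch below (hypothetical function name, not SAS code), the group standard deviations are taken from the "All" rows of the Variable Information table, and the results can be compared with the Mean Difference Divisor column of the standardized differences table.

```python
import math

def pooled_sd(sd_treated, sd_control):
    """Square root of the average of the two group variances."""
    return math.sqrt((sd_treated ** 2 + sd_control ** 2) / 2.0)

# (treated Std Dev, control Std Dev) before matching, from the "All" rows
group_sds = {
    "LPS": (0.681761, 0.844486),
    "Age": (5.534114, 6.579103),
    "Bmi": (1.863797, 1.980778),
    "Gender": (0.495575, 0.498270),
}

divisors = {name: pooled_sd(t, c) for name, (t, c) in group_sds.items()}
```

The computed values agree with the reported divisors (0.767448, 6.079104, 1.923178, 0.496925) up to rounding of the inputs.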

(28)

Standardized between-group differences for each variable

Standardized Variable Differences (Treated - Control): table and age calculation repeated from slide (26).

(29)

Standardized between-group differences for each variable

(30)

(31)


(32)

(33)

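Papers on propensity scores presented at the SAS Users Group Meeting (SASユーザー総会)

2010: Furukawa and Sugimoto. Practical issues in bias adjustment by the propensity score method

2011: Furukawa. On choosing between propensity score IPTW and stratified adjustment, which are used to control selection bias in observational studies, from the viewpoint of robustness

2017: This presentation. The PSMATCH procedure arrives! On nearest-neighbor matching

See the proceedings paper for details of the SAS programs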

(34)

SAS/STAT release history

SAS 9.3: SAS/STAT 9.3 (2011/07), 12.1 (2012/08)

SAS 9.4: SAS/STAT 12.3 (2013/07), 13.1 (2013/12), 13.2 (2014/08), 14.1 (2015/07), 14.2 (2016/11)

SAS/STAT 14.2: PSMATCH procedure, CAUSALTRT procedure

(35)

References

1. Austin PC, Grootendorst P, Anderson GM. A comparison of the ability of different propensity score models to balance measured variables between treated and untreated subjects: a Monte Carlo study. Statistics in Medicine. 26:734–753, 2007.

2. Austin PC. Optimal caliper widths for propensity-score matching when estimating differences in means and differences in proportions in observational studies. Pharmaceutical Statistics. 10:150–161, 2011.

3. Bangalore S, Guo Y, Samadashvili Z, Blecker S, Xu J, Hannan EL. Everolimus-eluting stents or bypass surgery for multivessel coronary disease. New England Journal of Medicine. 372:1213–1222, 2015.

4. Coca-Perraillon M. Local and global optimal propensity score matching. Proceedings of the SAS Global Forum. SAS Institute Inc., Cary, NC, 2007.

5. Faries D, Leon AC, Haro JM, Obenchain RL. Analysis of Observational Health Care Data Using SAS(R). SAS Institute Inc., Cary, NC, 2010.

6. Hida K, Okamura R, Sakai Y, Konishi T, Akagi T, Yamaguchi T, Akiyoshi T, Fukuda M, Yamamoto S, Yamamoto M, Nishigori T, Kawada K, Hasegawa S, Morita S, Watanabe M. Open versus laparoscopic surgery for advanced low rectal cancer: a large multicenter propensity score matched cohort study in Japan. Annals of Surgery, 2017.

7. Hamatani Y, Ogawa H, Uozumi R, Iguchi M, Yamashita Y, Esato M, Chun YH, Tsuji H, Wada H, Hasegawa K, Abe M, Morita S, Akao M. Low body weight is associated with the incidence of stroke in atrial fibrillation patients: Insight from the Fushimi AF Registry. Circulation Journal. 79:1009–1017, 2015.

8. JMP Japan Group. Matching, stratified analysis, and regression analysis using propensity scores with JMP (in Japanese). SAS Institute Japan Ltd., 2014.

(36)

References (Cont'd)

9. Lanehart RE, de Gil PR, Kim ES, Bellara AP, Kromrey JD, Lee RS. Propensity score analysis and assessment of propensity score approaches using SAS(R) procedures. Proceedings of the SAS Global Forum. SAS Institute Inc., Cary, NC, 2012.

10. Normand ST, Landrum MB, Guadagnoli E, Ayanian JZ, Ryan TJ, Cleary PD, McNeil BJ. Validating recommendations for coronary angiography following acute myocardial infarction in the elderly: a matched analysis using propensity scores. Journal of Clinical Epidemiology. 54:387–398, 2001.

11. Rosenbaum PR, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika. 70:41–55, 1983.

12. Rosenbaum PR, Rubin DB. Constructing a control group using multivariate matched sampling methods that incorporate the propensity score. The American Statistician. 39:33–38, 1985.

13. SAS Institute Inc. SAS/STAT(R) 14.2 User's Guide. SAS Institute Inc., Cary, NC, 2016.

14. Yamashita Y, Uozumi R, Hamatani Y, Esato M, Chun YH, Tsuji H, Wada H, Hasegawa K, Ogawa H, Abe M, Morita S, Akao M. Current status and outcomes of direct oral anticoagulant use in patients with atrial fibrillation in the real-world: The Fushimi AF Registry. Circulation Journal, 2017. DOI:10.1253/circj.CJ-16-1337.

15. Taguri M. Learning the basics of statistics from well-known papers: the propensity score (in Japanese). Coronary Intervention. 13:63–69, 2017.

16. Furukawa T, Sugimoto N. Practical issues in bias adjustment by the propensity score method (in Japanese). Proceedings of the SAS Users Group Meeting (SASユーザー総会論文集). 101–108, 2010.

17. Furukawa T. On choosing between propensity score IPTW and stratified adjustment, which are used to control selection bias in observational studies, from the viewpoint of robustness (in Japanese). Proceedings of the SAS Users Group Meeting (SASユーザー総会論文集). 409–416, 2011.

18. Yamamoto M, Morita S. Adjusted analysis using propensity scores (in Japanese). Kokyu (呼吸). 34:1187–1193, 2015.
