JAIST Repository: Improving the Accuracy of Classification Systems Using Boosting Methods
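Master's Thesis

Improving the Accuracy of Classification Systems Using Boosting Methods

Supervisor: Professor Ho Tu Bao
School of Knowledge Science (Knowledge System Science), Japan Advanced Institute of Science and Technology
150003 Satoshi Ayakawa
Examination committee: Professor Ho Tu Bao (chair), Associate Professor Masato Ishizaki, Associate Professor Kenji Satou, Associate Professor Yukio Hayashi
February 2003
Copyright © 2003 by Satoshi Ayakawa.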
Contents

1 Introduction
  1.1 Background of the research
  1.2 Purpose of the research
  1.3 Organization of this thesis
2 Classification Systems
  2.1 What is a classification system?
  2.2 Classification with decision trees (C4.5)
    2.2.1 Building the decision tree
    2.2.2 Tree pruning
  2.3 Evaluating classification systems
3 Boosting
  3.1 What is boosting?
  3.2 AdaBoost
  3.3 Features of AdaBoost
4 Improving AdaBoost
  4.1 Proposed improvement
  4.2 Model 1 (AdaBoost_M1)
  4.3 Model 2 (AdaBoost_M2)
5 System Overview
  5.1 Overview
  5.2 Classification system
  5.3 Evaluation system
6 Experiments
  6.1 Data sets
  6.2 Outline of the experiments
  6.3 Experimental procedure
  6.4 Results and discussion
    6.4.1 Comparison of C4.5 and BC4.5
    6.4.2 Comparison of BC4.5 and BC4.5_M1
    6.4.3 Comparison of BC4.5 and BC4.5_M2
7 Conclusion
List of Figures

2.1 The data classification process
2.2 Decision tree
2.3 k-fold cross-validation
3.1 The AdaBoost algorithm
4.1 The AdaBoost_M1 algorithm
4.2 The AdaBoost_M2 algorithm
5.1 Simple flowchart of the classification system
5.2 Simple flowchart of the evaluation system
6.1 Comparison of pruned C4.5 and un-pruned BC4.5
6.2 Comparison of pruned C4.5 and pruned BC4.5
6.3 Comparison of pruned BC4.5 and un-pruned BC4.5
6.4 Comparison of un-pruned BC4.5 and un-pruned BC4.5_M1
6.5 Comparison of un-pruned BC4.5 and pruned BC4.5_M1
6.6 Comparison of un-pruned BC4.5_M1 and pruned BC4.5_M1
6.7 Comparison of un-pruned BC4.5 and un-pruned BC4.5_M2
6.8 Comparison of un-pruned BC4.5 and pruned BC4.5_M2
6.9 Comparison of un-pruned BC4.5_M2 and pruned BC4.5_M2
List of Tables

6.1 Characteristics of the data sets
6.2 Comparison of C4.5 and BC4.5
6.3 Comparison of BC4.5 and BC4.5_M1
6.4 Comparison of BC4.5 and BC4.5_M2
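Chapter 1 Introduction

1.1 Background and purpose of the research

In recent years the volume of data has grown steadily with the development of information technology. The information industry therefore needs to extract useful information and knowledge from huge amounts of data, a task known as data mining. Data mining is a way of discovering, in large and varied data, knowledge that can be used in scientific investigation, business management, production control, and market analysis. Because it extracts knowledge from data, it is also called KDD (Knowledge Discovery in Databases).

Among the many data mining methods, this study focuses on classification systems, which predict unknown data by classifying past data. A good classifier gives appropriate predictions for unseen instances, so improving the accuracy of classification systems is an active research topic. One approach combines the classification rules (hypotheses) that a classification system discovers from data in order to build a more accurate rule. This approach is called ensemble learning and includes methods such as boosting and bagging.

This study focuses on improving classification accuracy with boosting. Of the various boosting methods, we use the AdaBoost algorithm, which can be applied to classification systems. AdaBoost has produced good results in experiments with a variety of learning systems, but it also has known problems. It is not very effective when, for example, the system it is applied to is too inaccurate, the rules it is applied to are too complex, or the training data contains many erroneous instances. One line of current boosting research addresses these problems by studying the relationship between the learning system and the boosting method. This thesis follows that line of research.

1.2 Purpose of this research

The main goal of this study is to reduce the error rate of classification systems. To that end, the study set out to survey boosting as one way of improving classification accuracy, to investigate the relationship between boosting and the classification system, and to propose improvements to boosting. Concretely, AdaBoost was applied to two kinds of decision trees of differing accuracy generated by C4.5, a decision-tree classification system: the "un-pruned tree", and the "pruned tree", which is derived from it so that it copes with general data. Their accuracy was then compared. In addition, two modified versions of AdaBoost were devised, applied to the same two kinds of C4.5 trees, and compared.

1.3 Organization of this thesis

This thesis consists of seven chapters. Chapter 2 describes classification systems, in particular the decision-tree system C4.5, and introduces methods for evaluating classification systems. Chapter 3 introduces boosting and describes the AdaBoost algorithm. Chapter 4 presents the proposed improvements to AdaBoost and the two modified algorithms. Chapter 5 outlines the system developed in this study. Chapter 6 reports the data used in the experiments, the experimental procedure, the results, and a discussion of the results. Chapter 7 gives the conclusions.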
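Chapter 2 Classification Systems

2.1 What is a classification system?

A classification system builds classification rules from a training data set and uses them to assign predictions (class labels) to unseen data. Examples include decision trees, neural networks, and nearest-neighbor methods.

Data classification has two steps. In the first, classification rules are built from the labels or concepts of the data. The rules are obtained by analyzing database records that are described by attributes. Each record carries one predetermined attribute, the class attribute, whose value is its label. Such a collection of records is called a data set, and its members are also called objects. Learning from training data whose class labels are already given is called supervised learning. By contrast, in unsupervised learning (clustering) the class labels of the records are not known, and the classes and their number are discovered during learning. This thesis deals only with supervised learning.

A learned model is usually expressed as classification rules, for example as a decision tree or as mathematical formulas. Take Figure 2.1(a) as an example. The training data set holds customer credit information. The class attribute is "credit_rating", so the generated rules conclude either "fair" or "excellent". The generated rules are then used to predict unseen data.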
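(a) Learning

Training data:
name      age     income  credit_rating
Sandy     <=30    low     fair
Bill      <=30    low     excellent
Courtney  31…40   high    excellent
Susan     >40     med     fair
Claire    >40     med     fair
Andre     31…40   high    excellent

Classification rule produced by the classification system:
If age = "31…40" and income = high Then credit_rating = excellent

(b) Classification

Test data:
name    age     income  credit_rating
Frank   >40     high    fair
Sylvia  <=30    low     fair
Anne    31…40   high    excellent

New data: (John, 31…40, high). Credit rating? excellent

(a) Learning: the classification system analyzes the training data and generates classification rules. (b) Classification: the test data is used to evaluate the rules. If the evaluation is satisfactory, satisfactory predictions can also be expected when the rules are applied to new data.

Figure 2.1 The data classification process

The second step is shown in Figure 2.1(b). The generated classification rules are used to classify data. The left part of the figure evaluates the predictive accuracy of the rules. Section 2.3 explains how classification systems are evaluated. If the training data itself were used for the evaluation, the result would not be a real evaluation, because the rules are specialized to that data. The test data must therefore differ from the training data. If the evaluation of the classifier is satisfactory, the class labels of unseen data (data whose class label is unknown) can be predicted with high accuracy.

Of the many classification methods, the following section introduces the decision-tree classification system used in this thesis, C4.5.

2.2 Classification with decision trees (C4.5)

A decision tree is a tree-structured flowchart like Figure 2.2. Each branching point (node) is an attribute of the data, each branch leaving it is a value of that attribute, and each leaf is a class label. The topmost node is the root. Figure 2.2 is a decision tree for the concept of whether a customer who wants a computer will buy one. To predict an unseen record, classification starts at the root (here "age"), proceeds from node to node, and yields the result at a leaf. Unlike neural networks, decision trees produce rules that are very easy to understand. The rules can be written simply in an ordinary "IF ... THEN ..." form.

Decision-tree classification systems include ID3, CART, and C4.5. This thesis uses C4.5. Section 2.2.1 explains how it builds a decision tree. C4.5 also prunes the tree, because the tree as built has many branches and is highly specialized to the training data (overfitting). Section 2.2.2 explains pruning.

age?
  <=30:  student?
           no:  no
           yes: yes
  31…40: yes
  >40:   credit_rating?
           excellent: no
           fair:      yes

Figure 2.2 Decision tree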
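To make the "IF ... THEN ..." reading of a decision tree concrete, the tree of Figure 2.2 can be written as nested conditions. This is only an illustrative sketch: the function name `buys_computer` and the string encodings of the attribute values are assumptions, and the branch layout is the one read off the figure.

```python
def buys_computer(age: str, student: str, credit_rating: str) -> str:
    """Classify one customer by walking the tree of Figure 2.2 from the root."""
    if age == "<=30":
        # IF age <= 30 AND student = yes THEN yes, otherwise no
        return "yes" if student == "yes" else "no"
    if age == "31...40":
        # This branch ends in a leaf directly below the root
        return "yes"
    # Remaining branch: age > 40, decided by the credit rating
    return "yes" if credit_rating == "fair" else "no"


print(buys_computer("<=30", "yes", "fair"))
print(buys_computer(">40", "no", "excellent"))
```

Each path from the root to a leaf corresponds to one rule, which is why a tree with many branches yields many narrowly specialized rules.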
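2.2.1 Building the decision tree

Tree construction starts by deciding what a node should be. If all class labels in the data set under examination are the same, the node becomes a leaf. Otherwise an attribute is selected for the node. This is repeated until every path ends in a leaf, at which point learning stops.

C4.5 selects the attribute with an evaluation function called the gain ratio, explained below.

Consider a set $S$ of $s$ instances. The class attribute of $S$ has $m$ classes, $C = \{C_1, \dots, C_m\}$, and $s_i$ is the number of instances in $S$ that belong to class $C_i$. The amount of information needed to describe $S$ is

$$I(s_1, s_2, \dots, s_m) = -\sum_{i=1}^{m} p_i \log_2(p_i) \qquad (2.1)$$

where $p_i = s_i / s$ is the probability of drawing an instance of class $C_i$ from $S$. The logarithm is taken to base 2 because the information is encoded in bits (for example, $\log_2(8)$ is 3 bits).

Next consider an attribute $A$ with $n$ values, $A = \{a_1, \dots, a_n\}$. $A$ divides $S$ into $n$ subsets $\{S_1, \dots, S_n\}$, where $S_j$ contains the instances of $S$ whose value of $A$ is $a_j$. If $A$ is chosen as the attribute of a node, these subsets become the nodes that branch off from the node for $S$. The next step is to compute the entropy obtained when $A$ is the node attribute. Let $s_{ij}$ be the number of instances of class $C_i$ in subset $S_j$. The entropy is

$$E(A) = \sum_{j=1}^{n} \frac{s_{1j} + \dots + s_{mj}}{s}\, I(s_{1j}, \dots, s_{mj}) \qquad (2.2)$$

Here $(s_{1j} + \dots + s_{mj})/s$ weights subset $S_j$ by its frequency in $S$, and $I(s_{1j}, \dots, s_{mj})$ is the amount of information needed to describe $S_j$, computed in the same way as in equation (2.1). From these two equations the gain obtained by using $A$ as the node attribute is

$$\mathrm{Gain}(A) = I(s_1, s_2, \dots, s_m) - E(A) \qquad (2.3)$$

ID3, the predecessor of C4.5, used this gain as its evaluation function. It gives good results but tends to favor attributes with many values, such as a name in a list. Using such an attribute as a node produces many subsets with a single member each. To avoid this, C4.5 applies a kind of normalization that adjusts the part of the gain that comes merely from having many values. For an instance, consider the information content of a message that conveys not the class the instance belongs to but the outcome of the test itself:

$$\mathrm{Split\_info}(A) = -\sum_{j=1}^{n} \frac{|S_j|}{s} \log_2 \frac{|S_j|}{s} \qquad (2.4)$$

where $|S_j| = s_{1j} + \dots + s_{mj}$. This is the total information obtained by dividing $S$ into $n$ subsets, whereas the information gain measures only the part of that information that is relevant to classification. The gain ratio of attribute $A$ is therefore, from equations (2.3) and (2.4),

$$\mathrm{Gain\_Ratio}(A) = \frac{\mathrm{Gain}(A)}{\mathrm{Split\_info}(A)} \qquad (2.5)$$

C4.5 computes the gain ratio for every attribute in the set in this way and selects the attribute with the highest value as the node attribute.

2.2.2 Tree pruning

The decision tree as built branches very heavily in order to classify the training set. It is specialized to that set alone (overfitting). The branches of the overly complex tree must be pruned so that it can also handle unseen data.

There are two ways to prune a decision tree.

The first, called "pre-pruning", decides not to divide a set of training instances any further. Its advantage is that no time is wasted building structure that would later be removed by simplification. A typical method looks for the best way to split a subset and evaluates the split by statistical significance, information gain, error reduction, and so on. If the evaluation falls below some threshold, the split is rejected and the tree for that subset is simply the most appropriate leaf. The difficulty is choosing the threshold. If it is too high, the tree becomes too simple. If it is too low, the method has almost no effect.

The second, called "post-pruning", removes parts of the structure retrospectively after it has been built. The tree is grown as usual and the parts that have adapted too closely to the training data are then cut away. The computation spent building parts of the tree that are later discarded is a real cost, but pruning after the tree has been grown, although slower, is the more reliable method.

2.3 Evaluating classification systems

Estimating the accuracy of a classification system matters because it tells us how well the rules it generates will predict unseen data. For example, when a classification system is trained on past sales data to predict what customers will buy, we need to know how far its predictions can be trusted for real customers. Evaluation is equally important when comparing the performance of classification systems. This section introduces two common evaluation methods, holdout and cross-validation.

Holdout randomly divides the given data into training data and test data. Typically two thirds of the data is assigned to training and the rest to testing. The classification system learns from the training data, and the generated rules are evaluated on the test data. This gives a better evaluation than testing on the data used for training. The procedure is repeated k times and the mean is taken as the evaluation of the classification system.

k-fold cross-validation randomly divides the given data into k data sets of equal size, $S_1, S_2, \dots, S_k$. Training and testing are then repeated k times as follows. In iteration i, data set $S_i$ is used as test data and the remaining data sets as training data. The classification system learns from the training data, and the generated rules are evaluated on the test data. The mean of the k results is the evaluation of the classification system. 10-fold cross-validation is generally recommended for evaluating classification systems. In this thesis the classification systems were evaluated by running 10-fold cross-validation 10 times.

[Figure 2.3: diagram in which the data is split into S1, S2, ..., Sk; one part is the test data and the rest is the training data fed to the classification system, whose rules are evaluated to give evaluation 1]

The figure shows the work done in the first iteration of k-fold cross-validation. This is repeated k times, and the mean of the evaluations is the evaluation of the classifier.

Figure 2.3: k-fold cross-validation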
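Returning to the attribute-selection criterion of Section 2.2.1, the short sketch below computes equations (2.1) to (2.5) for one discrete attribute. It is an illustration rather than the C4.5 implementation: the function names `info` and `gain_ratio` are hypothetical, continuous attributes are not handled, and returning 0 when the split information is zero is an assumption made to avoid a division by zero.

```python
from collections import Counter
from math import log2


def info(labels):
    """Equation (2.1): information needed to describe a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gain_ratio(values, labels):
    """Equations (2.2)-(2.5) for one attribute given as a list of its values."""
    n = len(labels)
    subsets = {}
    for v, y in zip(values, labels):
        subsets.setdefault(v, []).append(y)          # S_j for each value a_j
    expected = sum(len(s) / n * info(s) for s in subsets.values())      # E(A)
    gain = info(labels) - expected                                       # Gain(A)
    split = -sum(len(s) / n * log2(len(s) / n) for s in subsets.values())
    return gain / split if split > 0 else 0.0        # assumption: 0 if no split


# Training data of Figure 2.1(a)
age = ["<=30", "<=30", "31...40", ">40", ">40", "31...40"]
rating = ["fair", "excellent", "excellent", "fair", "fair", "excellent"]
print(round(gain_ratio(age, rating), 4))  # (2/3) / log2(3), about 0.4206
```

For this data the class information is 1 bit, only the "<=30" subset is mixed, so E(age) = 1/3 and Gain(age) = 2/3. Dividing by the split information log2(3) gives the value printed above.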
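Chapter 3 Boosting

3.1 What is boosting?

The basic idea of boosting is to combine rules into a better rule. If the rules are $\{h_1, h_2, \dots, h_T\}$, the combined rule can be written as

$$f(x) = \sum_{t=1}^{T} \alpha_t h_t(x)$$

where $\alpha_t$ is the coefficient of $h_t$. Both $\alpha_t$ and $h_t$ are computed during the boosting learning process. Algorithms of this kind are called ensemble learning algorithms. Other examples are Bagging and Arcing.

The roots of boosting lie in research on PAC learning. It began when Kearns and Valiant raised the question of whether learning machines that are only slightly more accurate than random guessing can be combined into a better learning machine. In 1989 Schapire devised the first boosting algorithm with a theoretical proof. This early form of boosting was tested on OCR with neural networks as the base, but it ran into various problems when applied to learning algorithms used in practice.

In 1995 Freund and Schapire developed AdaBoost (Figure 3.1), which can be applied to practical learning algorithms. The following section explains how it works. In theory the algorithm avoids overfitting as learning proceeds, but experiments have shown that it does not give good results when the training data is noisy.

Recent boosting research includes the development of more effective algorithms and the study of how boosting relates to the performance of the base learning machine. This study focuses on analyzing the relationship between the performance of the base learning machine and AdaBoost. It also modifies AdaBoost and analyzes that relationship for the modified versions. AdaBoost and the modifications are described below.

3.2 AdaBoost

AdaBoost (Adaptive Boosting) was proposed by Freund and Schapire to solve the problems of early boosting. The algorithm is shown in Figure 3.1 and its basic operation is as follows. The input is training data $\{(x_1, y_1), \dots, (x_m, y_m)\}$, where each $x_i$ belongs to some space $X$ and each $y_i$ to some label set $Y$. The base learning machine (BaseLearner) is then called in $T$ rounds ($t = 1, \dots, T$). The key idea is to use sampling according to a probability distribution (or weights) defined over the training data. The weight of instance $i$ under the distribution in round $t$ is written $D_t(i)$. The weights are initially all equal. In each round the weights of the incorrectly predicted instances are increased, so that the BaseLearner concentrates on the harder instances.

Within AdaBoost the job of the BaseLearner is to find and output a weak hypothesis $h_t : X \to Y$ that suits the distribution $D_t$. The quality of the weak hypothesis $h_t$ is measured by its error under $D_t$:

$$\epsilon_t = \sum_{i:\, h_t(x_i) \neq y_i} D_t(i)$$

Note that $D_t$ is the distribution over instances that was used to train the BaseLearner.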
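The algorithm AdaBoost

Input:
  m training instances $\{(x_1, y_1), \dots, (x_m, y_m)\}$ with labels $y_i \in Y = \{1, \dots, k\}$
  the base learning machine: BaseLearner
  the number of trials $T$

Step 1: Initialize the weight of each instance: $D_1(i) = 1/m$.

Step 2: For $t = 1, 2, \dots, T$:
  1. Train the BaseLearner using the distribution $D_t$.
  2. Obtain a hypothesis $h_t : X \to Y$.
  3. Compute the error of $h_t$ under $D_t$: $\epsilon_t = \sum_{i:\, h_t(x_i) \neq y_i} D_t(i)$.
     If $\epsilon_t > 1/2$, set $T = t - 1$ and return to the first step.
  4. Compute the importance of the hypothesis: $\beta_t = \epsilon_t / (1 - \epsilon_t)$.
  5. Update the weights:
     $$D_{t+1}(i) = \frac{D_t(i)}{Z_t} \times \begin{cases} \beta_t & \text{if } h_t(x_i) = y_i \\ 1 & \text{otherwise} \end{cases}$$
     where $Z_t$ is a normalization constant.

Step 3: Final hypothesis:
  $$h_{fin}(x) = \arg\max_{y \in Y} \sum_{t:\, h_t(x) = y} \log \frac{1}{\beta_t}$$

Figure 3.1: The AdaBoost algorithm

Each time a weak hypothesis $h_t$ is obtained, $\beta_t$ is set as shown in Figure 3.1. It expresses the importance of $h_t$. If $\epsilon_t \le 1/2$ then $\beta_t \le 1$, which can be assumed without loss of generality, and the smaller $\epsilon_t$ is, the smaller $\beta_t$ becomes. The distribution $D_t$ is then updated by the rule in Figure 3.1. The rule reduces the weight of the instances that the weak hypothesis classified correctly, so that after normalization the weight of the misclassified instances increases. In this way the weight concentrates on the difficult instances.

The final hypothesis $h_{fin}$ is a weighted majority vote over the $T$ weak hypotheses $h_t$ obtained in this way, weighted by their importance $\beta_t$.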
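The steps of Figure 3.1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the thesis's Visual C++ system: the base learner here is a weighted one-level decision tree (`train_stump`) standing in for C4.5, all function names are hypothetical, boosting simply stops when the weighted error exceeds 1/2, and the error is clamped away from zero so that log(1/β) stays finite.

```python
from math import log


def train_stump(X, y, w):
    """Weighted one-level decision tree; a stand-in for the BaseLearner."""
    labels = sorted(set(y))
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            left = {c: 0.0 for c in labels}
            right = {c: 0.0 for c in labels}
            for row, yi, wi in zip(X, y, w):
                side = left if row[f] <= thr else right
                side[yi] += wi
            l_lab = max(labels, key=left.get)    # heaviest label on each side
            r_lab = max(labels, key=right.get)
            err = sum(w) - left[l_lab] - right[r_lab]
            if best is None or err < best[0]:
                best = (err, f, thr, l_lab, r_lab)
    _, f, thr, l_lab, r_lab = best
    return lambda row: l_lab if row[f] <= thr else r_lab


def adaboost(X, y, T, base_learner=train_stump):
    m = len(X)
    D = [1.0 / m] * m                         # Step 1: uniform weights
    ensemble = []                             # pairs (h_t, beta_t)
    for _ in range(T):                        # Step 2
        h = base_learner(X, y, D)
        pred = [h(row) for row in X]
        eps = sum(d for d, p, yi in zip(D, pred, y) if p != yi)
        if eps > 0.5:                         # hypothesis too weak: stop
            break
        beta = max(eps, 1e-10) / (1.0 - eps)  # clamp is an assumption
        ensemble.append((h, beta))
        if eps == 0.0:                        # perfect hypothesis: stop
            break
        D = [d * beta if p == yi else d for d, p, yi in zip(D, pred, y)]
        z = sum(D)                            # normalization constant Z_t
        D = [d / z for d in D]
    return ensemble


def classify(ensemble, row):
    """Step 3: weighted majority vote with weights log(1 / beta_t)."""
    votes = {}
    for h, beta in ensemble:
        label = h(row)
        votes[label] = votes.get(label, 0.0) + log(1.0 / beta)
    return max(votes, key=votes.get)


X = [[0], [1], [2]]
y = [1, 2, 1]
ensemble = adaboost(X, y, T=3)
print([classify(ensemble, row) for row in X])  # [1, 2, 1]
```

No single stump can separate the middle point from its two neighbours, but three boosted stumps classify all three training points correctly, which is the effect the weight update is meant to produce.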
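3.3 Features of AdaBoost

AdaBoost has many good properties in practice. It is a simple algorithm, easy to program, and computationally efficient. Apart from the number of trials $T$, it has no parameters to tune. It needs no prior knowledge about the BaseLearner and can be combined with any algorithm that produces hypotheses. Moreover, a theoretical performance guarantee holds under mild conditions on the predictive accuracy of the BaseLearner and the size of the training data. Instead of finding a learning algorithm that is highly accurate over the whole domain, it is enough to find one that is only slightly more accurate than random prediction. However, boosting is not considered effective in theory when the training data is insufficient, when the weak hypotheses are too complex, or when the predictive accuracy of the BaseLearner is too low.

AdaBoost has been evaluated experimentally by many researchers. Freund and Schapire [2] applied AdaBoost to C4.5, a decision-tree classification system, and to an algorithm that uses decision trees with a single node (decision stumps). They showed that boosted decision stumps perform as well as C4.5 and that combining AdaBoost with C4.5 improves accuracy.

Quinlan [1] applied two ensemble learning algorithms, AdaBoost and Bagging, to C4.5. Both were effective for C4.5, and AdaBoost was more effective than Bagging. The number of boosting trials in that experiment was 10.

Boosting has been reported to be effective in experiments with various other learning systems, but cases where it is not effective have also been reported. For example, in the OCR experiments of Freund and Schapire [5], when the data contains a very large number of outliers, concentrating the weight on difficult instances can seriously harm learning. Several modified algorithms have been published to deal with such cases.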
Boosting has also been put to an unusual use: detecting outliers. This exploits the fact that boosting concentrates learning on the instances that are predicted incorrectly.
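Chapter 4 Improving AdaBoost

4.1 Proposed improvement

To analyze the behavior of AdaBoost in more depth, this thesis modifies the algorithm and compares the result with AdaBoost. Many modifications of AdaBoost have been studied. Quinlan [1], for example, changed the way the final hypothesis is determined, with reasonably good results. As mentioned in the previous chapter, there are also methods that change the weight update so that learning does not concentrate too heavily on misclassified instances.

This thesis modifies the weight update. AdaBoost updates the weights in the same way for every instance. However, the labels $y_i \in Y = \{1, \dots, k\}$ do not occur in equal proportions in the training data, and a label that occurs rarely can be considered more important to predict than a label that occurs often. For example, suppose there are 10 training instances with labels $y_i \in Y = \{-1, +1\}$. If seven of them carry one label, only three carry the other, and the rarer label can be considered the more important one to predict. The algorithm was modified so that the weight update takes this into account, and the effect on the performance of AdaBoost was then compared.

Two modified algorithms were created. In this thesis Model 1 is called AdaBoost_M1 and Model 2 is called AdaBoost_M2. The following sections describe them.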
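4.2 Model 1 (AdaBoost_M1)

To make AdaBoost take into account an importance based on the proportion of each label in the training data, this algorithm adds the following steps to the learning process.

First, before learning starts, the proportion of each label, $D(y)$ for $y \in Y = \{1, \dots, k\}$, is prepared.

Learning by the BaseLearner and the computation of the error rate and of the importance of the hypothesis are the same as in ordinary AdaBoost.

Next, the weights are updated as in Figure 4.1. For an instance that was predicted correctly, the weight is multiplied by $\beta_t$ and by the proportion $D(y_i)$ of the instance's actual label. The weight of correctly predicted instances therefore decreases. Because it is also multiplied by the label proportion, the weight of an instance whose label has a small proportion becomes smaller still than that of an instance whose label has a large proportion. The remaining steps are the same as in ordinary AdaBoost.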
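The algorithm Modified AdaBoost 1 (AdaBoost_M1)

Input:
  m training instances $\{(x_1, y_1), \dots, (x_m, y_m)\}$ with labels $y_i \in Y = \{1, \dots, k\}$
  the proportion of each label in the training data: $D(y_i)$
  the base learning machine: BaseLearner
  the number of trials $T$

Step 1: Initialize the weight of each instance: $D_1(i) = 1/m$.

Step 2: For $t = 1, 2, \dots, T$:
  1. Train the BaseLearner using the distribution $D_t$.
  2. Obtain a hypothesis $h_t : X \to Y$.
  3. Compute the error of $h_t$ under $D_t$: $\epsilon_t = \sum_{i:\, h_t(x_i) \neq y_i} D_t(i)$.
     If $\epsilon_t > 1/2$, set $T = t - 1$ and return to the first step.
  4. Compute the importance of the hypothesis: $\beta_t = \epsilon_t / (1 - \epsilon_t)$.
  5. Update the weights:
     $$D_{t+1}(i) = \frac{D_t(i)}{Z_t} \times \begin{cases} \beta_t \times D(y_i) & \text{if } h_t(x_i) = y_i \\ 1 & \text{otherwise} \end{cases}$$
     where $Z_t$ is a normalization constant.

Step 3: Final hypothesis:
  $$h_{fin}(x) = \arg\max_{y \in Y} \sum_{t:\, h_t(x) = y} \log \frac{1}{\beta_t}$$

Figure 4.1: The AdaBoost_M1 algorithm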
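The only change from Figure 3.1 is step 5, so a sketch of that one step is enough to see the effect. The helper names `label_proportions` and `update_weights_m1` are hypothetical, and the code assumes the weighted error lies strictly between 0 and 1/2 (the case of a zero error is not covered by the figure).

```python
from collections import Counter


def label_proportions(y):
    """D(y): fraction of the training instances that carry each label."""
    m = len(y)
    return {label: c / m for label, c in Counter(y).items()}


def update_weights_m1(D, y, pred, prop):
    """Step 5 of Figure 4.1 for one round; assumes 0 < eps <= 1/2."""
    eps = sum(d for d, yi, p in zip(D, y, pred) if p != yi)
    beta = eps / (1.0 - eps)
    # Correct instances are scaled by beta_t * D(y_i); the others keep their weight
    new = [d * beta * prop[yi] if p == yi else d
           for d, yi, p in zip(D, y, pred)]
    z = sum(new)                      # normalization constant Z_t
    return [d / z for d in new]


# The 7-to-3 example of Section 4.1, with one instance misclassified
y = [1] * 7 + [-1] * 3
pred = list(y)
pred[0] = -1
D = update_weights_m1([0.1] * 10, y, pred, label_proportions(y))
print(D[0] > D[1] > D[9])  # True
```

In this example the misclassified instance ends up with the largest weight, and among the correctly classified instances those with the rarer label end up with the smaller weight, as described in Section 4.2.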
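4.3 Model 2 (AdaBoost_M2)

To make AdaBoost take into account an importance based on the proportion of each label in the training data, this algorithm updates the weights separately for each label. The idea is that, within one round of AdaBoost, once the BaseLearner has finished learning, the training data is split by label, the weights are updated per label, and the parts are then recombined.

As Figure 4.2 shows, the procedure is the same as ordinary AdaBoost up to item 2 of Step 2. Next, the error under the weights is computed for each label, $\epsilon(y)$ for $y \in Y = \{1, \dots, k\}$. If the error for even one label exceeds 1/2, learning stops. The per-label importance of the hypothesis, $\beta(y)$ for $y \in Y = \{1, \dots, k\}$, is then computed. The mean of the per-label errors gives the importance of the hypothesis as a whole. The weights are updated using the per-label importance of the hypothesis, and finally they are normalized.

The majority vote of the final hypothesis uses the hypothesis importance computed in item 6 of Step 2.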
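The algorithm Modified AdaBoost 2 (AdaBoost_M2)

Input:
  m training instances $\{(x_1, y_1), \dots, (x_m, y_m)\}$ with labels $y_i \in Y = \{1, \dots, k\}$
  the base learning machine: BaseLearner
  the number of trials $T$

Step 1: Initialize the weight of each instance: $D_1(i) = 1/m$.

Step 2: For $t = 1, 2, \dots, T$:
  1. Train the BaseLearner using the distribution $D_t$.
  2. Obtain a hypothesis $h_t : X \to Y$.
  3. Compute the per-label errors $\epsilon(1), \dots, \epsilon(k)$ under $D_t$.
     If $\epsilon(y) > 1/2$ for some $y \in Y = \{1, \dots, k\}$, set $T = t - 1$ and return to the first step.
  4. Compute the per-label importance of the hypothesis: $\beta(y) = \epsilon(y) / (1 - \epsilon(y))$ for $y \in Y = \{1, \dots, k\}$.
  5. Compute the mean of the per-label errors: $\epsilon_t = \frac{1}{k} \sum_{y=1}^{k} \epsilon(y)$.
  6. Compute the importance of the hypothesis: $\beta_t = \epsilon_t / (1 - \epsilon_t)$.
  7. Update the weights per label:
     $$D'_{t+1}(i) = D_t(i) \times \begin{cases} \beta(y_i) & \text{if } h_t(x_i) = y_i \\ 1 & \text{otherwise} \end{cases}$$
  8. Normalize the weights.

Step 3: Final hypothesis:
  $$h_{fin}(x) = \arg\max_{y \in Y} \sum_{t:\, h_t(x) = y} \log \frac{1}{\beta_t}$$

Figure 4.2: The AdaBoost_M2 algorithm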
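One round of items 3 to 8 in Figure 4.2 might look like the sketch below. The figure does not spell out how the per-label error is computed, so this sketch assumes it is the misclassified weight of a label divided by the total weight of that label. The function name `update_weights_m2` and the convention of returning `None` when boosting should stop are likewise assumptions.

```python
def update_weights_m2(D, y, pred):
    """One AdaBoost_M2 round: returns (new weights, beta_t), or None to stop."""
    labels = sorted(set(y))
    eps = {}
    for c in labels:
        total = sum(d for d, yi in zip(D, y) if yi == c)
        wrong = sum(d for d, yi, p in zip(D, y, pred) if yi == c and p != yi)
        eps[c] = wrong / total                    # item 3 (assumed definition)
    if any(e > 0.5 for e in eps.values()):        # stop if any label fails
        return None
    beta = {c: e / (1.0 - e) for c, e in eps.items()}      # item 4
    eps_t = sum(eps.values()) / len(labels)                # item 5
    beta_t = eps_t / (1.0 - eps_t)                         # item 6
    new = [d * beta[yi] if p == yi else d                  # item 7
           for d, yi, p in zip(D, y, pred)]
    z = sum(new)                                           # item 8
    return [d / z for d in new], beta_t


y = [1, 1, 1, 1, 2, 2, 2, 2]
pred = [2, 1, 1, 1, 1, 2, 2, 2]       # one mistake per label
weights, beta_t = update_weights_m2([1 / 8] * 8, y, pred)
print(round(beta_t, 4), round(weights[0], 4))  # 0.3333 0.25
```

Under this reading a label with zero error gets β(y) = 0, which drives the weights of its correctly classified instances to zero. A practical implementation would need to decide how to handle that case.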
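Chapter 5 System Overview

5.1 Overview

To investigate the relationship between boosting and the accuracy of classification systems, the following systems were built.

- "un-pruned BC4.5": AdaBoost applied to the C4.5 tree before pruning
- "pruned BC4.5": AdaBoost applied to the C4.5 tree after pruning
- "un-pruned BC4.5_M1": AdaBoost_M1 applied to the C4.5 tree before pruning
- "pruned BC4.5_M1": AdaBoost_M1 applied to the C4.5 tree after pruning
- "un-pruned BC4.5_M2": AdaBoost_M2 applied to the C4.5 tree before pruning
- "pruned BC4.5_M2": AdaBoost_M2 applied to the C4.5 tree after pruning
- "CV": a 10-times 10-fold cross-validation system

In total seven systems were built: six classification systems and one system for evaluating them. The programs were written in "Microsoft Visual C++ 6.0". C4.5 itself uses the source code of "C4.5 Release 8" from "http://www.cse.unsw.edu.au/~quinlan/". The classification systems and the CV system communicate through files.

Section 5.2 describes the classification systems and Section 5.3 the CV system in detail.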
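5.2 Classification system

The classification systems apply each of the three boosting algorithms to the C4.5 tree before and after pruning. Six classification systems were built, but they all work in essentially the same way. In the boosting algorithm, learning is supposed to be redone when the weighted error rate of a hypothesis exceeds 0.5 after the BaseLearner has finished. Because of the nature of C4.5, however, a different rule cannot be produced from the same data set, so boosting is stopped at that point instead.

The changes made in order to apply boosting to C4.5 are as follows.

- Changes to C4.5
  - Made it compile with Visual C++.
  - Changed the input and output (explained below).
  - The parameters needed for C4.5 learning use their default values.
  - Removed all functions other than tree construction and tree pruning.
  - Made learning without pruning possible.
  - Restricted the output to the error rate on the test data.
- Additions for boosting
  - Added the parameters and functions needed for boosting.
  - Made the number of trials configurable.
  - Learning ends as soon as the weighted error rate exceeds 0.5.
  - Outputs the error rate on the test data for the boosted rules, and the number of trials.

The flow up to the output of the error rate is as follows. The classification system first receives the file name, the kind of decision tree, and the number of boosting trials, in that order. With the file name "DF", the input looks like

C:\BC45 DF 1 10

The kind of decision tree is 1 for the un-pruned tree and 2 for the pruned tree. The program then reads the two files "DF.names" and "DF.data" and initializes its parameters.

It then repeats the following steps for the number of trials T.
1. Generate a decision tree with C4.5.
2. Compute the weighted error rate (learning ends if the error rate is 0.5 or more).
3. Update the weights.
4. Normalize the weights.

After learning ends, the program reads the "DF.test" file. The first boosting round is always the result of a classification system to which AdaBoost has not yet been applied (that is, C4.5), so its error rate is output. The rules generated over all trials are then combined into the final hypothesis and its error rate is output. The output therefore consists of three items: the error rate before AdaBoost is applied, the error rate after it is applied, and the number of boosting trials.

Figure 5.1 shows the flowchart of the classification system.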
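[Figure 5.1: flowchart with the following steps]
1. Read the "DF.names" and "DF.data" files and set i = 1.
2. Run weighted learning with C4.5.
3. Test whether ε ≤ 0.5. If not, leave the loop.
4. Update the weights. If i has not yet reached T, set i = i + 1 and return to step 2.
5. Read the "DF.test" file.
6. Voting: classify by weighted majority vote over the hypotheses obtained in each round and compute the error rate.

Figure 5.1 Simple flowchart of the classification system. Here ε is the weighted error rate of C4.5 and T is the number of trials.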
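5.3 Evaluation system

The CV system communicates with the classification systems through files. As in the previous section, let the data set name be "DF". The system first reads the "DF.all" file and then performs the following steps 10 times.

1. Shuffle the data.
2. Divide the data into 10 parts.
3. Do the following for each of the parts (10 times in total):
   i. Write one part to the "DF.test" file.
   ii. Write the rest to the "DF.data" file.
   iii. Run learning with the six classification systems.
   iv. Receive the error rates output by the classification systems.
4. Compute the mean of the error rates.

After these steps the mean of the error rates is computed once more and output. The system therefore computes the mean error rate over 100 classification runs of each classification system.

Figure 5.2 shows a simple flowchart of the CV system.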
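[Figure 5.2: flowchart with the following steps]
1. Read the DF.all file and set i = 1.
2. Shuffle the data.
3. Divide the data into 10 parts, S[1] to S[10], and set j = 1.
4. Write S[j] to "DF.test" and the rest to "DF.data".
5. Obtain the error rates output by the six classification systems.
6. If j ≥ 10 is false, set j = j + 1 and return to step 4.
7. If i ≥ 10 is false, set i = i + 1 and return to step 2.
8. Output the mean of the error rates.

Figure 5.2 Simple flowchart of the evaluation system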
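The loop structure of the CV system can be sketched as follows. This is an in-memory illustration only: the real system exchanges the "DF.test" and "DF.data" files with separate programs, whereas here a hypothetical callback `train_and_test(train, test)` stands in for one classification system and returns its error rate. The striding used to form the folds and the `seed` parameter are also assumptions.

```python
import random


def repeated_cross_validation(data, train_and_test, repeats=10, k=10, seed=0):
    """Mean error over `repeats` runs of k-fold cross-validation."""
    rng = random.Random(seed)
    errors = []
    for _ in range(repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)                          # step 1: shuffle
        folds = [shuffled[j::k] for j in range(k)]     # step 2: split into k parts
        for j in range(k):                             # step 3: each part once as test
            test = folds[j]
            train = [row for i, fold in enumerate(folds) if i != j for row in fold]
            errors.append(train_and_test(train, test))
    return sum(errors) / len(errors)                   # mean of repeats * k error rates


def always_wrong_on_evens(train, test):
    """Toy stand-in for a classifier: error rate = share of even numbers."""
    return sum(1 for v in test if v % 2 == 0) / len(test)


print(repeated_cross_validation(list(range(100)), always_wrong_on_evens))
```

With the defaults, `train_and_test` is called 100 times, matching the 100 error rates that the CV system averages for each classification system.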
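Chapter 6 Experiments

6.1 Data sets

The purpose of the experiments is to measure the accuracy of the classification systems. A classification system shows different accuracy on different data sets, so the more data sets the better, and they should also be varied. For evaluating a system, unusual data sets are not needed, and well-known data sets are considered preferable. A database on the Web meets all of these conditions: the one on the UCI (University of California Irvine) home page (http://www.ics.uci.edu/mlearn/MLRepository.html), which is maintained for specialists who evaluate learning systems.

This study uses 30 data sets obtained from the UCI database, centered on the 27 data sets used in [1]. All systems in this study accept only files in the C4.5 format, so the data sets were converted to that format. Data sets without missing values were selected. Attribute values are either continuous or discrete, and the classes are discrete. Table 6.1 shows the characteristics of each data set.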
Data set        Instances  Classes  Continuous attrs  Discrete attrs
anneal                898        6                 6              32
audiology             226       24                70               0
auto                  205        7                14              12
breast                699        2                10               0
chess                3500        2                 0              36
cleve                 303        2                 6               7
diabetes              768        2                 8               0
DNA                  3186        3                 0             180
flare                1066        2                 2               8
glass                 214        5                10               0
heart                 270        2                13               0
hepatitis             155        2                 6              14
horse-colic           368        2                 6              16
hypothyroid          2800        2                 7              22
ionosphere            351        2                34               0
iris                  150        3                 4               0
labor-neg              85        2                 8               8
letter              20000       26                16               0
lymphography          148        4                 3              15
segment              2310        7                19               0
shuttle             58000        7                 9               0
sick-euthyroid       3163        2                 7              18
sonar                 208        2                59               0
soybean-large         316       19                 0              35
splice               3190        3                 0              59
vehicle               840        4                18               0
vote                  435        2                 0              16
waveform-21          5000        3                21               0
wine                  178        3                13               0
zoo                   101        7                 0              18

Table 6.1 Characteristics of the data sets
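6.2 Outline of the experiments

Three experiments were carried out:

- comparison of C4.5 and BC4.5
- comparison of BC4.5 and BC4.5_M1
- comparison of BC4.5 and BC4.5_M2

Each experiment also compares the un-pruned tree with the pruned tree.

The number of boosting trials for the boosted classification systems was set to 10, as in [1]. All comparisons are based on the estimated error rates measured with the CV system.

6.3 Experimental procedure

The procedure is as follows.
1. Select a data set.
2. Load the data set into the CV system.
3. Obtain the error rates of all classification systems.

This was done for every data set. Once all error rates had been output, the three comparisons listed above were made.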
6.4 Results and discussion

6.4.1 Comparison of C4.5 and BC4.5

                C4.5                 BC4.5 un-pruned   BC4.5 pruned
Data set        un-pruned  pruned    T      err        T      err
                err        err
anneal          0.0578     0.0793    10.0   0.0508     10.0   0.0569
audiology       0.2296     0.2213     9.9   0.2230     10.0   0.2551
auto            0.1762     0.1962    10.0   0.1640     10.0   0.1821
breast          0.0592     0.0557    10.0   0.0392     10.0   0.0388
chess           0.0056     0.0057    10.0   0.0056     10.0   0.0074
cleve           0.2356     0.2438    10.0   0.1990     10.0   0.2064
diabetes        0.2593     0.2600    10.0   0.2517     10.0   0.2486
DNA             0.0835     0.0761    10.0   0.0832     10.0   0.1744
flare           0.1884     0.1728     9.9   0.1769     10.0   0.1708
glass           0.3169     0.3169    10.0   0.2541     10.0   0.2509
heart           0.2478     0.2178    10.0   0.1996     10.0   0.1915
hepatitis       0.2133     0.2043    10.0   0.1796     10.0   0.1808
horse-colic     0.1742     0.1459    10.0   0.1601     10.0   0.1513
hypothyroid     0.0078     0.0073    10.0   0.0087     10.0   0.0080
ionosphere      0.1075     0.1059    10.0   0.0730     10.0   0.0718
iris            0.0500     0.0580     9.7   0.0553     10.0   0.0600
labor-neg       0.2180     0.2193     9.8   0.1570     10.0   0.1977
letter          0.1190     0.1197    10.0   0.0531     10.0   0.0552
lymphography    0.2601     0.2348    10.0   0.1992     10.0   0.2092
segment         0.0339     0.0333    10.0   0.0206     10.0   0.0215
shuttle         0.0002     0.0003     9.2   0.0002      9.9   0.0002
sick-euthyroid  0.0238     0.0213    10.0   0.0226     10.0   0.0219
sonar           0.2725     0.2592     9.9   0.1971      9.9   0.1816
soybean-large   0.0918     0.0845    10.0   0.0729     10.0   0.0594
splice          0.0789     0.0576     9.9   0.0677     10.0   0.0625
vehicle         0.2789     0.2726    10.0   0.2457     10.0   0.2434
vote            0.0545     0.0476    10.0   0.0517     10.0   0.0435
waveform-21     0.2376     0.2355    10.0   0.1721     10.0   0.1737
wine            0.0711     0.0732     8.6   0.0502      9.4   0.0440
zoo             0.0731     0.0813     2.3   0.0691      8.6   0.1180
ave             0.1409     0.1369     9.6   0.1168      9.9   0.1229

Table 6.2 Comparison of C4.5 and BC4.5. T for BC4.5 is the mean number of boosting trials.
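Table 6.2 shows that for C4.5, as expected from its design, the pruned tree gives better results than the un-pruned tree. The pruned tree of C4.5 was therefore compared with the two kinds of BC4.5.

Figure 6.1 Comparison of pruned C4.5 and un-pruned BC4.5 (scatter plot; x axis: un-pruned BC4.5 error rate, y axis: pruned C4.5 error rate, both from 0.0 to 0.3)

Un-pruned BC4.5 reduced the error rate relative to the pruned tree of C4.5 on 23 of the 30 data sets, and the mean error rate fell by about 2.0%. Figure 6.1 plots the error rate of un-pruned BC4.5 on the x axis against the error rate of pruned C4.5 on the y axis. The error rate of un-pruned BC4.5 was worse on 7 data sets, but the graph confirms that none of them deteriorated substantially.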
Figure 6.2 Comparison of pruned C4.5 and pruned BC4.5 (scatter plot; x axis: pruned BC4.5 error rate, y axis: pruned C4.5 error rate, both from 0.0 to 0.3)

Pruned BC4.5 reduced the error rate relative to the pruned tree of C4.5 on 24 of the 30 data sets, and the mean error rate fell by about 1.5%. Figure 6.2 plots the error rate of pruned BC4.5 on the x axis against the error rate of pruned C4.5 on the y axis. The error rate of pruned BC4.5 was worse on 6 data sets, and the graph shows that it increased substantially on three of them (audiology, DNA, zoo).
Figure 6.3 Comparison of un-pruned and pruned BC4.5 (scatter plot; x axis: pruned BC4.5 error rate, y axis: un-pruned BC4.5 error rate, both from 0.0 to 0.3)

Both un-pruned and pruned BC4.5 gave better results than C4.5. The reduction in the mean error rate shows that AdaBoost gives better results when applied to the un-pruned tree of C4.5 than to the pruned tree. Figure 6.3 shows that pruned BC4.5 does better than un-pruned BC4.5 on many data sets, but on a few data sets its error rate is substantially higher than that of un-pruned BC4.5. Taken together, these results indicate that AdaBoost is more effective when applied to the un-pruned tree of C4.5 than to the pruned tree.
6.4.2 Comparison of BC4.5 and BC4.5_M1

The previous section showed that un-pruned BC4.5 gives better results than pruned BC4.5, so BC4.5_M1 is compared with un-pruned BC4.5.

                BC4.5 un-pruned   BC4.5_M1 un-pruned   BC4.5_M1 pruned
Data set        T      err        T      err           T      err
anneal          10.0   0.0508      6.7   0.0597        10.0   0.1173
audiology        9.9   0.2230      1.1   0.2305         7.4   0.2612
auto            10.0   0.1640     10.0   0.1912         9.8   0.2468
breast          10.0   0.0392      9.0   0.0416        10.0   0.0392
chess           10.0   0.0056     10.0   0.0056        10.0   0.0057
cleve           10.0   0.1990     10.0   0.1983        10.0   0.1962
diabetes        10.0   0.2517     10.0   0.2530        10.0   0.2462
DNA             10.0   0.0832      5.7   0.0838        10.0   0.1579
flare            9.9   0.1769      1.0   0.1884        10.0   0.1708
glass           10.0   0.2541      9.3   0.2803        10.0   0.2783
heart           10.0   0.1996     10.0   0.2004        10.0   0.1848
hepatitis       10.0   0.1796      5.4   0.2027        10.0   0.2053
horse-colic     10.0   0.1601      9.9   0.1601        10.0   0.1434
hypothyroid     10.0   0.0087      4.6   0.0083        10.0   0.0165
ionosphere      10.0   0.0730     10.0   0.0778        10.0   0.0794
iris             9.7   0.0553      9.9   0.0573        10.0   0.0633
labor-neg        9.8   0.1570      9.9   0.1720        10.0   0.2280
letter          10.0   0.0531     10.0   0.0635        10.0   0.0639
lymphography    10.0   0.1992      7.7   0.2297        10.0   0.2373
segment         10.0   0.0206     10.0   0.0239        10.0   0.0248
shuttle          9.2   0.0002      9.7   0.0003        10.0   0.0010
sick-euthyroid  10.0   0.0226      3.2   0.0232        10.0   0.0329
sonar            9.9   0.1971      9.9   0.2005        10.0   0.1937
soybean-large   10.0   0.0729      8.9   0.0807        10.0   0.0791
splice           9.9   0.0677      9.3   0.0750        10.0   0.0810
vehicle         10.0   0.2457     10.0   0.2461        10.0   0.2420
vote            10.0   0.0517     10.0   0.0545        10.0   0.0440
waveform-21     10.0   0.1721     10.0   0.1783        10.0   0.1764
wine             8.6   0.0502      9.8   0.0489         9.8   0.0512
zoo              2.3   0.0691      2.0   0.0788         5.1   0.1447
ave              9.6   0.1168      8.1   0.1238         9.7   0.1337
Table 6.3 Comparison of BC4.5 and BC4.5_M1

Figure 6.4 Comparison of un-pruned BC4.5 and un-pruned BC4.5_M1 (scatter plot; x axis: un-pruned BC4.5_M1 error rate, y axis: un-pruned BC4.5 error rate, both from 0.0 to 0.3)

Un-pruned BC4.5_M1 has a better error rate than C4.5. Compared with un-pruned BC4.5, however, the error rate increased on 27 of the 30 data sets, and the mean error rate rose by about 0.8%. Figure 6.4 plots the error rate of un-pruned BC4.5_M1 on the x axis against the error rate of un-pruned BC4.5 on the y axis. The graph shows no especially large deterioration for un-pruned BC4.5_M1, but the error rate increased on almost all data sets.
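Figure 6.5 Comparison of un-pruned BC4.5 and pruned BC4.5_M1 (scatter plot; x axis: pruned BC4.5_M1 error rate, y axis: un-pruned BC4.5 error rate, both from 0.0 to 0.3)

Pruned BC4.5_M1 also has a better error rate than C4.5. Compared with un-pruned BC4.5, the error rate fell on 7 of the 30 data sets, which is more than for un-pruned BC4.5_M1. The mean error rate, however, is about 1.7% higher than that of un-pruned BC4.5. Figure 6.5 plots the error rate of pruned BC4.5_M1 on the x axis against the error rate of un-pruned BC4.5 on the y axis. In the graph the data sets on which the error rate of pruned BC4.5_M1 deteriorated badly stand out. Overall the results were not very good, but on 4 data sets pruned BC4.5_M1 had the lowest error rate of the six classification systems compared.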
Figure 6.6 Comparison of un-pruned BC4.5_M1 and pruned BC4.5_M1 (scatter plot; x axis: pruned BC4.5_M1 error rate, y axis: un-pruned BC4.5_M1 error rate, both from 0.0 to 0.3)

Both un-pruned and pruned BC4.5_M1 gave better results than C4.5, but both were worse than BC4.5. Figure 6.6 shows that pruned BC4.5_M1 does better than un-pruned BC4.5_M1 on many data sets, but on a few data sets its error rate is substantially higher. Taken together, these results indicate that AdaBoost_M1 is more effective when applied to the un-pruned tree of C4.5 than to the pruned tree.
6.4.3 Comparison of BC4.5 and BC4.5_M2

As in the previous section, BC4.5_M2 is compared with un-pruned BC4.5.

                BC4.5 un-pruned   BC4.5_M2 un-pruned   BC4.5_M2 pruned
Data set        T      err        T      err           T      err
anneal          10.0   0.0508      0.9   0.0578         0.9   0.0793
audiology        9.9   0.2230      0.0   0.2296         0.0   0.2213
auto            10.0   0.1640      1.0   0.1762         1.0   0.1962
breast          10.0   0.0392     10.0   0.0382        10.0   0.0373
chess           10.0   0.0056     10.0   0.0058        10.0   0.0170
cleve           10.0   0.1990     10.0   0.2428        10.0   0.2552
diabetes        10.0   0.2517      1.8   0.2591         1.9   0.2602
DNA             10.0   0.0832     10.0   0.0742        10.0   0.0703
flare            9.9   0.1769      0.0   0.1884         0.0   0.1728
glass           10.0   0.2541      1.0   0.3169         1.0   0.3169
heart           10.0   0.1996     10.0   0.2100        10.0   0.1822
hepatitis       10.0   0.1796      7.9   0.1853         2.0   0.2003
horse-colic     10.0   0.1601     10.0   0.1573        10.0   0.1540
hypothyroid     10.0   0.0087     10.0   0.0138        10.0   0.0097
ionosphere      10.0   0.0730     10.0   0.0813        10.0   0.0792
iris             9.7   0.0553     10.0   0.0500        10.0   0.0567
labor-neg        9.8   0.1570      5.4   0.1720         1.7   0.2283
letter          10.0   0.0531      5.6   0.1106         9.1   0.1098
lymphography    10.0   0.1992      1.2   0.2601         0.9   0.2348
segment         10.0   0.0206     10.0   0.0290         9.9   0.0300
shuttle          9.2   0.0002      1.0   0.0002         1.2   0.0003
sick-euthyroid  10.0   0.0226      9.9   0.0298        10.0   0.0256
sonar            9.9   0.1971      6.3   0.2480         7.7   0.2385
soybean-large   10.0   0.0729      0.8   0.0918         0.8   0.0845
splice           9.9   0.0677     10.0   0.0541        10.0   0.0639
vehicle         10.0   0.2457      1.1   0.2789         1.2   0.2726
vote            10.0   0.0517     10.0   0.0446        10.0   0.0437
waveform-21     10.0   0.1721     10.0   0.1857        10.0   0.1858
wine             8.6   0.0502     10.0   0.0591        10.0   0.0617
zoo              2.3   0.0691      1.0   0.0731         1.0   0.0813
ave              9.6   0.1168      6.2   0.1308         6.0   0.1323

Table 6.4 Comparison of BC4.5 and BC4.5_M2
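Figure 6.7 Comparison of un-pruned BC4.5 and un-pruned BC4.5_M2 (scatter plot; x axis: un-pruned BC4.5_M2 error rate, y axis: un-pruned BC4.5 error rate, both from 0.0 to 0.3)

Un-pruned BC4.5_M2 has a better error rate than C4.5. Compared with un-pruned BC4.5, however, the error rate fell on only 4 of the 30 data sets, and the mean error rate rose by about 1.4%. On many data sets boosting also stopped before the set number of trials was reached, so learning did not proceed. Figure 6.7 plots the error rate of un-pruned BC4.5_M2 on the x axis against the error rate of un-pruned BC4.5 on the y axis. The graph shows large deteriorations in the error rate of un-pruned BC4.5_M2, and the error rate is worse on almost all data sets.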
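Figure 6.8 Comparison of un-pruned BC4.5 and pruned BC4.5_M2 (scatter plot; x axis: pruned BC4.5_M2 error rate, y axis: un-pruned BC4.5 error rate, both from 0.0 to 0.3)

Pruned BC4.5_M2 has a better error rate than C4.5. Compared with un-pruned BC4.5, the error rate fell on 6 of the 30 data sets, but the mean error rate rose by about 1.6%. On 4 of the data sets where the error rate fell, however, pruned BC4.5_M2 had the lowest error rate of all the classification systems used in the experiments. In addition, on 2 data sets where AdaBoost had increased the error rate, the error rate fell below that of C4.5. As before, on many data sets boosting stopped before the set number of trials was reached, so learning did not proceed. Figure 6.8 plots the error rate of pruned BC4.5_M2 on the x axis against the error rate of un-pruned BC4.5 on the y axis. The graph shows large deteriorations in the error rate of pruned BC4.5_M2, and the error rate is worse on almost all data sets.

Figure 6.9 Comparison of un-pruned BC4.5_M2 and pruned BC4.5_M2 (scatter plot; x axis: pruned BC4.5_M2 error rate, y axis: un-pruned BC4.5_M2 error rate, both from 0.0 to 0.3)

AdaBoost_M2 was able to reduce the error rate of C4.5, but both variants gave worse results than AdaBoost. The mean error rate was better when AdaBoost_M2 was applied to the un-pruned tree, but Figure 6.9 shows little difference between the two systems. For both systems there were many data sets on which boosting stopped early and learning did not proceed. The algorithm needs to be improved so that learning can proceed.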
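Chapter 7 Conclusion

The experiments confirmed that AdaBoost and the modified versions of AdaBoost can improve the accuracy of C4.5. They also confirmed that the error rate of C4.5 is reduced more when boosting is applied to the "un-pruned tree" than to the "pruned tree". When boosting was applied to the "un-pruned tree", the error rate increased on some data sets but accuracy never fell drastically. When it was applied to the "pruned tree", the error rate increased drastically on several data sets. The "pruned tree" is a more accurate hypothesis than the "un-pruned tree". This suggests that AdaBoost is not effective for hypotheses that are too accurate. The experimental results do not mean simply that boosting is ineffective for hypotheses with a low error rate. They mean that boosting is not very effective when the base learning algorithm performs well.

This study also modified the AdaBoost algorithm, but the modified algorithms could not improve the accuracy of C4.5 more than AdaBoost did. The system that applied AdaBoost_M2 to the "pruned tree" had the worst reduction in error rate overall, yet on some data sets its reduction was the largest. It was also effective on some data sets where AdaBoost was not. Experiments with other learning algorithms and further refinement are needed, but this algorithm may prove effective for well-performing algorithms, that is, for strong learning algorithms.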
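Finally, as future work: the experiments confirmed that the effectiveness of boosting varies with the base learning algorithm, and also that the reduction in error rate differs widely between data sets. With the data sets used in the experiments, however, no clear difference in effectiveness could be attributed to the number of instances, the number of attributes, or the number of labels. To improve boosting effectively, it will therefore also be necessary to investigate how its effectiveness changes from one data set to another.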
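Acknowledgments

I am deeply grateful to Professor Ho Tu Bao for his consistent guidance and advice on every aspect of this research, from its start to its completion. I sincerely thank Associate Professor Masato Ishizaki for his advice on the direction of the research. I sincerely thank Research Associate Nguyen Dung Trong for his warm comments and careful guidance on everything from elementary questions to research methods.

Finally, I thank all the members of the Ho and Ishizaki laboratories for their valuable comments at every stage of this research.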
References

[1] J. R. Quinlan, 1998, Bagging, boosting, and C4.5.
[2] Robert E. Schapire, 1999, A Brief Introduction to Boosting.
[3] Michael J. Kearns and Umesh V. Vazirani, 1994, An Introduction to Computational Learning Theory.
[4] Yoav Freund and Robert E. Schapire, 1995, A decision-theoretic generalization of on-line learning and an application to boosting.
[5] Yoav Freund and Robert E. Schapire, 1996, Experiments with a new boosting algorithm.
[6] Robert E. Schapire, 1990, The strength of weak learnability. Machine Learning.
[7] Yoav Freund and Robert Schapire (Japanese translation by Naoki Abe), 1999, A Short Introduction to Boosting.
[8] Michael J. A. Berry and Gordon Linoff, Data Mining Techniques (Japanese edition), translated by SAS Institute, Jun Ehara, and Eisaku Sato, Kaibundo, 1999.
[9] J. R. Quinlan, Data Analysis with AI (Programs for Machine Learning; Japanese edition), translation supervised by Koichi Furukawa, Toppan, 1995.
[10] H. M. Deitel and P. J. Deitel, C Language Programming (Computer Science Textbook; Japanese edition), translated by Ryuichi Kojima, Prentice Hall, 1998.
[11] Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, The Morgan Kaufmann Series in Data Management Systems, Jim Gray, Series Editor, Morgan Kaufmann Publishers, 2000.