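A Proposal of a Method for Detecting Communication-Disrupting Behavior on Electronic Bulletin Boards Using Comment Information (3.2 The 5th Information Synergy Research Meeting, 3. Research Activity Reports)
Authors
Yu Ichifuji, Susumu Konno, Hideaki Sone
Journal
年報 (Annual Report)
Volume
6
Pages
88-94
Publication date
2007-07
URL
http://hdl.handle.net/10097/48529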
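Summary: On Internet bulletin boards, comments that disrupt communication appear frequently. When such comments are repeated, the board turns hostile, with adverse effects such as a decline in users, so a method for finding such comments is needed. This paper focuses on the words that often appear in communication-disrupting comments. We count the occurrences of each word per comment and across the whole board, set and evaluate a weight for each word from its appearance ratio, and aim to detect communication-disrupting behavior.

A detecting method of BBS vandalism based on information of comments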
Yu Ichifuji*, Susumu Konno**, Hideaki Sone**
* Graduate School of Information Sciences, Tohoku University
** Information Synergy Center, Tohoku University
Abstract: Electronic bulletin board systems (BBS) suffer from vandalism, and an operator needs to find it quickly when it happens. To detect such vandalism, we focus on the words used in it. We count the occurrences of these words in each comment and in the BBS as a whole, and from the two counts we calculate an appearance ratio. The weight of each word is determined from this ratio and used to detect vandalism. We show the detection results and discuss how efficiently the method finds vandalism.
Contents
1. Introduction
2. A new method focusing on poster IDs
3. How to evaluate the ruination figure RF
4. Verification experiment
5. Conclusion

1.1 Background
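Electronic bulletin boards
• A non-real-time tool for exchanging opinions and information
• Widely used in education, in business, and by the general public
• Some users disrupt normal communication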
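Types of communication-disrupting behavior (vandalism)
1. Posts that disrupt communication
2. Disclosure of another person's private information
3. Posts that make readers uncomfortable
4. Posts that provoke others for the pleasure of watching them react (provocation and baiting)
5. Consecutive useless posts that obstruct normal reading
6. Disclosure of unfounded information to disparage a target
Administrators need to know when vandalism occurs.

1.2 Existing services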
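Human monitoring service (ピットクルー株式会社, Pit Crew Co., Ltd.)
• NG words are registered
• Around the clock, a person checks every time an NG word appears
• Advantage: a person understands the context before judging, so vandalism is found reliably
• Drawback: labor costs are high, so the service does not suit privately run boards
If this could be automated without relying on people...

1.3 Overview of the previous method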
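Approaches to automating vandalism detection
• Context understanding
• NG-word filters
• Dependency relations between comments
• Representing the board as a directed graph
Vandalism targeted
• Posts that disrupt communication
• Posts that make readers uncomfortable
• Posts that provoke others for the pleasure of watching them react (provocation and baiting)
The aim is to detect vandalism with a simple method.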
Format of the target boards
• Comments are listed in the order in which they were posted
• A reply to a specific comment uses a symbol called an anchor
Monitoring support method using the ruination figure*
Points of focus
• Words that have a psychological effect
• Chains of comments
• The atmosphere of the board (ruination figure)
*: Yu Ichifuji, Susumu Konno, Hideaki Sone, "A method to monitor a BBS using feature extraction of text data", International Conference on Human.Society@Internet, (2005) 349-352
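What is the ruination figure (Ruination Figure [RF])?
• An index that quantifies the atmosphere of the board
• It draws on words that evoke good or bad feeling and on chains of comments
• The influence of the words and the influence of the comment chain give the influence of each comment, and the influences of the comments are summed to give RF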
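Definitions for calculating the ruination figure
pw: set (dictionary) of words that evoke good feeling in the other party
nw: set (dictionary) of words that evoke bad feeling in the other party
pww: weight of a word registered in pw (calculated by tf-idf)
nww: weight of a word registered in nw (calculated by tf-idf)
cn: number of times each word appears in one comment

Ws (Word score): influence of a comment due to its words

$$Ws(t) = \sum_{i \in pw} pww(i)\,cn(i) - \sum_{j \in nw} nww(j)\,cn(j)$$

The operator joining the two sums is missing in this copy. Subtraction is assumed, so that pw words raise the score and nw words lower it.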
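As a minimal sketch of the word score, the snippet below counts dictionary words in one tokenized comment and combines them with per-word weights. The function name `word_score`, the two dictionaries, and their weights are hypothetical and chosen only for illustration; the slides compute the weights by tf-idf and do not describe tokenization. Subtracting the negative-word term is an assumption, since the operator is not legible in the source.

```python
from collections import Counter


def word_score(tokens, pw_weights, nw_weights):
    """Ws for one comment: weighted pw count minus weighted nw count (sign assumed)."""
    counts = Counter(tokens)  # cn: occurrences of each word in this comment
    positive = sum(w * counts[word] for word, w in pw_weights.items())
    negative = sum(w * counts[word] for word, w in nw_weights.items())
    return positive - negative


# Hypothetical dictionaries and weights, for illustration only.
pw_weights = {"thanks": 1.0, "helpful": 0.5}
nw_weights = {"idiot": 2.0, "stupid": 1.5}

comment = ["thanks", "idiot", "idiot"]
score = word_score(comment, pw_weights, nw_weights)
```

With these made-up weights, one positive word and two occurrences of a negative word give a negative score, which is the direction the later steps treat as a sign of vandalism.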
ccs (Comment Chain Score): influence of a comment calculated from the length of its comment chain

$$ccs(t) = \log\{Res(t)\}$$

t: comment number
Res(t): number of chained replies to comment t
Example: comment 3 has 4 replies and comment 119 has 1 reply, so Res(3) = 4 and Res(119) = 1.
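The comment chain score can be sketched in a few lines. The function name `comment_chain_score` is hypothetical. The slides do not state the base of the logarithm or how a comment with no replies is handled, so this sketch assumes the natural logarithm and simply rejects a reply count below 1.

```python
import math


def comment_chain_score(res):
    """ccs = log(Res) for a comment with `res` chained replies (natural log assumed)."""
    if res < 1:
        # The source does not cover Res(t) = 0; raising is this sketch's choice.
        raise ValueError("Res(t) must be at least 1")
    return math.log(res)


ccs_3 = comment_chain_score(4)    # the slide's example: Res(3) = 4
ccs_119 = comment_chain_score(1)  # the slide's example: Res(119) = 1
```

Under this definition a comment with a single reply scores zero, so only longer chains add weight.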
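Ss (Statement Score): influence exerted by each comment
Ss(t) combines Ws(t) and ccs(t), scaled by their largest magnitudes |Max(Ws)| and |Max(ccs)|. The exact arrangement of these terms is not recoverable from this copy.

RF (Ruination Figure): evaluation index for the board

$$RF(t) = \sum_{j=1}^{t} Ss(j)$$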
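Assuming that RF(t) is the running total of the statement scores Ss(1) to Ss(t), as the summation on the slide suggests, the series can be sketched as a cumulative sum. The function name `ruination_figure` and the sample scores are hypothetical. The Ss values are taken as given, because the exact Ss formula is not legible in the source.

```python
from itertools import accumulate


def ruination_figure(ss_values):
    """Return [RF(1), RF(2), ...] as the running sum of the statement scores."""
    return list(accumulate(ss_values))


# Hypothetical statement scores in posting order.
ss_values = [0.5, -1.0, -2.0, 0.25]
rf = ruination_figure(ss_values)
```

A run of negative statement scores pulls the series down, which is the kind of drop the evaluation step later looks for.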
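Problems with the previous method
• It cannot detect every kind of vandalism, because the method is too simple
• Vandalism it targets: posts that disrupt communication, posts that make readers uncomfortable, and posts that provoke others for the pleasure of watching them react (provocation and baiting)
Vandalism is therefore classified by its characteristics, and a detection method suited to each class is established. Ultimately, a combination of several methods should detect a wide range of vandalism.

1.4 Purpose of this presentation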
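We propose a method for detecting vandalism in which the same person takes part repeatedly.
Examples of vandalism in which the same person appears repeatedly
• Baiting and provocation: deliberately setting off a flame war
• Bashing: reacting to provocation or vandalism by blaming, denouncing, and condemning the other party
• Slanging matches (including sock puppetry): one person posing as several people, or several people arguing and making other readers uncomfortable

1.5 Definition of each type of vandalism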
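Examples of baiting, provocation, and bashing
[Figure: example thread whose posts are labelled baiting, bashing, and provocation or bashing]

Chapter 2: A new method focusing on poster IDs

2.1 Preparation for detecting vandalism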
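Point of focus: the poster ID
Definition of the poster ID
• Generated from the IP address (together with the board being posted to and the date)
• Assigned automatically
• The same ID is treated as the same person
• The name field, by contrast, is set freely by the poster
Would emphasizing the Ss of IDs that post repeatedly make the targeted vandalism easier to detect?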
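Characteristics and classification of vandalism in which the same ID takes part repeatedly
CASE1: A poster makes a comment, is rebutted by many people at once, refuses to concede, and fights back with offensive words, so the thread turns hostile. This case is subdivided further by how the replies are made and by who makes them.
CASE2: A poster baits others, and the thread turns hostile when the expected reactions arrive or when bashing breaks out.
CASE3: One person posts a run of comments that pick fights with several unrelated comments.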
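CASE1: one against many
• The person who was attacked is very likely to appear again
• The people who attacked may also appear again
• Replies follow one of two patterns
  Pattern 1: one comment answers everyone
  Pattern 2: one comment answers one person, which produces consecutive comments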
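CASE2: The baiter abuses the people who reacted, so the same person appears again.
CASE3: Because the comments are consecutive, the same person appears repeatedly.
For CASE1, 2, and 3 we focus on the poster ID and propose a new calculation that aims to detect the vandalism.

2.2 Calculation in the proposed method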
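Calculation procedure
1. Count the number of comments for each ID
2. Only for an ID that has N or more comments whose Ss values sum to a negative value, multiply Ss by T and call the result Ss'
3. Calculate RF from Ss', plot it as a graph, and evaluate the graph by eye

Ss(t) reflects the influence of the words and the influence of the comment chain. When the number of comments is N or more and the sum of Ss is negative:

$$Ss'(t) = T \cdot Ss(t)$$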
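The calculation procedure can be sketched as follows, taking each comment as a pair of poster ID and statement score. The function name `emphasize_repeat_posters` and the sample data are hypothetical. Two points are assumptions, because the slides leave them open: the count and the sum are taken over the whole thread rather than a sliding window, and every Ss of a qualifying ID is multiplied by T, including any positive ones.

```python
from collections import defaultdict


def emphasize_repeat_posters(posts, n, t):
    """posts: list of (poster_id, ss) in posting order. Returns the Ss' values."""
    count = defaultdict(int)    # step 1: number of comments per ID
    total = defaultdict(float)  # sum of Ss per ID
    for poster_id, ss in posts:
        count[poster_id] += 1
        total[poster_id] += ss
    # Step 2: scale Ss by T only for IDs with >= N comments and a negative Ss sum.
    return [
        t * ss if count[poster_id] >= n and total[poster_id] < 0 else ss
        for poster_id, ss in posts
    ]


# Hypothetical thread: ID "A" posts three times with a negative total.
posts = [("A", -1.0), ("B", 0.5), ("A", -0.5), ("A", 0.25), ("B", -0.25)]
adjusted = emphasize_repeat_posters(posts, n=3, t=2)
```

With N = 3 and T = 2, the values used in the experiment section, only the scores of ID "A" are doubled here. The adjusted scores would then feed the RF calculation in step 3.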
3. How to evaluate the ruination figure RF
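How RF is displayed
• RF is split into groups of 10 (RF(1)~RF(10), RF(11)~RF(20), ...)
• For each group, the maximum, minimum, opening, and closing values are calculated and drawn as a candlestick, as used for displaying stock prices
[Figure: example candlesticks labelled with opening value, closing value, maximum, and minimum]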
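The grouping into candlesticks can be sketched as below. The function name `candlesticks` and the sample RF series are hypothetical. The sketch assumes that the opening value is the first RF in each group and the closing value is the last, following the usual stock-chart convention, and that a shorter final group is kept rather than discarded.

```python
def candlesticks(rf, size=10):
    """Summarize an RF series as open/close/high/low per group of `size` comments."""
    candles = []
    for start in range(0, len(rf), size):
        group = rf[start:start + size]
        candles.append({
            "open": group[0],    # first RF value in the group
            "close": group[-1],  # last RF value in the group
            "high": max(group),
            "low": min(group),
        })
    return candles


# Hypothetical RF series of 13 comments: one full group and one partial group.
rf = [0, 1, 2, 1, 0, -1, -3, -2, -4, -5, -5, -4, -6]
candles = candlesticks(rf)
```

A long candle, meaning a large gap between high and low or between open and close, marks a group of comments over which RF moved a lot.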
Candlesticks to watch
A range in which the candlestick elements are longer than elsewhere and densely clustered is a range suspected of being vandalized.
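Detecting vandalism
Vandalism is judged from the degree of change in the ruination figure.
• In (1), RF rises gently
• In (2), RF drops sharply
• In (3), RF drops gently
A change of topic occurred at (2). Because RF dropped sharply there, we judge, in relative terms, that vandalism may have occurred.

Chapter 4: Verification experiment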
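4.1 Experimental target
Target boards and their evaluation
• Limited to the university entrance exam board of "2ちゃんねる"
• Subjective evaluation: four students read the threads and judged which ranges were vandalized
Verification method
• Compare the previous method with the proposed method while varying N and T

4.2 Evaluation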
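Thread shown: "東北大学 理系専用 part2"
Ranges of vandalism
• 5-22: provocation and bashing in which the same ID appears
• 110-122: provocation and bashing by [illegible], mixed with normal posts
• 131-152: provocation and bashing in which the same ID appears, mixed with normal posts

4.3.1 Output of the previous method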
[Figure: RF plotted against comment number (×10)]

4.3.2 Output of the new method (T=2, N=3)
[Figure: RF plotted against comment number (×10)]