早稲田大学大学院理工学研究科
博 士 論 文 概 要
論 文 題 目
Internet Flow Analysis by an Extended Protocol Machine
拡張プロトコルマシンによるインターネットのフロー分析
申 請 者
Heshmatollah Khosravi
へシマトラ コスラビ
氏 名
専攻・研究指導
(課程内のみ) 情報科学専攻 情報アーキテクチャ
2004 年 12 月
I P p a ck e t s ar e t h e b a s i c co mp o n e n t i n th e Int er n e t co mmun i c ati o ns. A log ic a l co m mu n i c at io n l in k b et w e en t wo n o d e s is c al le d a f lo w wh i ch i s a s e ri e s o f p a ck et s.
N e t w o rk a d m i n i s t r a t o r s h a v e u s e d p a c k e t a n a l y s i s f o r t r o u b l e s h o o t i n g a n d i n v e s t i g a t i n g the l ink s t atu s . I n r e c ent y e ar s , n et wo rk at t a cking ha s in cr e a s ed. P a ck e t an aly s e s a r e b e c o m i n g m o r e c o m m o n i n i n t r u s i o n d e t e c t i o n sy stems. Ho wev er, th e existin g method s ar e on ly a b le to lo ok for t h e k n o wn sig n a t ur es and p at t ern s i n the p a ck e t s. T h ey la ck th e a b i l i t y t o d e t e c t t h e a n o ma l i e s . T h i s t h e s i s p ro p o s e s a n e w m e t h o d f o r n et w o r k monitoring that is based on flow analy sis.
T h i s p ap e r a n a l y s e s n e t w o rk t r a ff i c n o t o n t h e p a c k e t - b y -p a c k e t b as i s , b u t b y u s i n g g r o u p o f p a c k e t s c a l l e d a f l o w. A f l o w c o n s i s t s o f a g r o u p o f p a c k e t s t h a t h av e i d e n t i c a l p o r t s a n d I P a d d re s s e s a n d a r r i v e c l o s e t o g et h e r i n t i me . I t i s me a n in g fu l t o u s e T CP flo w to c at eg o ri z e IP p a ck et s b e c au s e h av i n g th e s a m e so u r c e a n d d es tin a tio n I P ad d r es s a n d p o r t n u mb e r p ai r s u n i q u e l y i d e n t i f i e s e a c h T CP f l o w. We n e e d a t o o l f o r o b s e r v i n g a TC P fl o w in wo rk in g n et wo rk s an d d e al w ith lo st an d d u p li c at ed p ac k e t s . S in c e th e r e al traffi c do es n o t fo llo w th e wel l kno wn standard o f t h e T CP p r o to co l , t h i s th e si s d e fin e s and cl a s si f ie s flo w s by ext endi n g t h e p r o to c o l m a c h i n e . We h av e i m p ro v e d t h e o r i g i n a l TC P p ro to co l ma c h i n e to b e u s ed p r a ct i c al ly in w o rk in g n et wo rk s . It sh o u ld b e n o t ed h e r e t h a t t h e t i me p ar a me t e r i s c r i t i c al i n s ep ar at i n g f l o ws . S p e c i al c a r e i s n e c e s sa r y w h e n t h e m e t h o d i s a p p li e d t o M o b i l e I P n e t w o r k s. T h i s i s d u e t o t h e a d d it i o n a l d e l a y t i m e t h a t n e e d e d f o r b i n d i n g u p d a t e i n M o b i l e I P. I t s h o u l d a l s o d e a l w i t h t h e d u a l I P ad d r es s e s wh i ch a r e u s ed b y a h o st wh il e mo v i n g . Th i s p ap er in v e s tig at e s th e t i me i s s u e s i n M o b i l e I P a n d p ro p o s e s a n d i mp l e me n t s a n e w t e chn iqu e th at r edu c es the binding update time, which is useful to solve the problem.
Ch ap t e r 2 ex p l a in s th e t r an s p o rt co n tr o l p ro to c o l (T CP ). I t d e s c ri b e s i mp o r tan t me c h an i s m s u s ed b y TC P f o r flo w co n tro l a n d co n g e st io n av o id an c e. TC P i s th e do min ant proto co l i n th e In t er n et . Mo s t of th e Int e rn et t r affi c ar e c ar ri ed by th e TCP. I t i s o r i g i n a l l y d e f i n ed i n R F C 7 9 3 , a n d e n h a n c ed in su b s eq u en t RF C s su c h a s R FC8 9 6 , R F C 2 5 8 1 a n d R F C 2 7 5 7 . T h e o p e r a t i o n o f T C P i s m o d el e d a s a f i n it e s t a t e m a c h i n e . Sev e r al s t at e s an d ev ent s ar e ex pl ai ne d in d et ai l. Co ng es ti on cont rol i s on e o f t h e i mp o r t an t is su e s in th e I n t er n et . W h i l e T CP i s e n h a n c e d w i t h c o n g e s t i o n a v o id a n c e me c h an i s m a n d f ai r q u eu in g i s i mp l e me n t e d in r out e r s, th e r e ex is ts t r a ff i c t h a t d o e s n o t u s e e n d - t o - e n d c o n g e s t i o n c o n tr o l a n d c a u s e s c o n g e st io n in t h e n et wo r k . Th o s e t ra ff ic s ar e c ar r i e d b y mu l t i c a st , o r u n i c a s t t r a ff i c su ch a s str e a mi n g mu lt i me d ia th at d o e s not n eed reli abili ty. A no the r i mpo rta n t is s u e wh i ch i s a d d r e s s e d i n t h i s c h a p t e r i s i n t ru s i o n at t a ck s . E v en i f T C P i s a ro b u s t p ro to co l, it i s s t i l l p r o n e t o a t t a c k s f r o m o u t s i d e . T h e r e h av e b e en two k in d s o f in t ru sio n d et e ct io n sy st e ms : o n e i s b a s ed o n th e mi su s e d et e ct i o n and th e o t he r i s b a s ed o n ano ma l i e s d et e ct ion. Th e mi s u s e d et e c tion c h e ck s th e pa ck et s ag ain s t kn own p at t er n or s ign at ur es . Th e ano ma ly b a s ed a ppro a ch d et e ct s th e at t a c k s by
1
o b s e r v i n g t h e a n o ma l y b e h a v io r s t h a t de vi at e fro m t h e no r ma l a ct ivi ti e s of th e sy st e m o r t h e u s e r s . T h e p ro b l e m w i t h t h e e x i st i n g i n t r u s i o n d et e c t i o n s y s t e m s i s t h a t t h e y e i t h e r g en e ra t e l o ts o f f al s e p o si tiv e al ar ms , o r f al s e n e g at iv e s ( m is s to d et e ct t h e at t a ck s) . On e o f t h e r e a so n s f o r f al s e p o s i t i v e a l a r m s i s s i m p l i f i e d p at t er n c h e c k in g . I t i s p e rfo r me d b y an I DS ( In t ru sio n D et e ct io n S y s t e m ) w h i c h o b s e r v e s s o m e s p e c i f i c s i g n a t u re s s u c h a s “ D E B U G ” o r “ W I Z A R D ” t h a t a r e f o u n d i n t h e b o d y o f a m a i l . I n ad d i t i o n t o t h e f ac t , at t a ck e r s may u s e a n e w t r i c k q u i c k l y s o t h a t t h e y c a n b y p a s s t h e detection sy stem.
Chapter 3 proposes and implements a new TCP protocol machine. It realizes a mu ltipurpose tool that can be used in working networks for flow analy sis, congestion monitoring and intrusion detection. It is developed based on the finite state machine (automaton). It is well known that RFC 793 defines a state diagram for TCP. However, the model doesn’t conform to the real TCP behavior. It also does not match to a
mathematical model of an automaton or a sequential machine. Some of the functions of TCP such as ABORT are missed in the original model. The major drawback is that mo st of the arcs include two events simu ltaneously. An automaton can handle one event at a time. It also functions sequentially. For example, it first receive a SYN, and then it responds with SYN-ACK. We modified and extended the original model to conform to the TCP behavior and also match with the finite state machine. The new machine accepts one event at a time and moves to the next state. In our extended machine, rcv-SYN and snd-SYN-ACK would be two independents tran sitions. The new protocol machine is a non-deterministic automaton that replaces invisible events, such as passive OPEN, and ABORT by empty sy mbol (ε). We use tcpdump to capture the real packets at the network gateway segment. The captured packets are grouped by the protocol machine into two categories called Valid flows and Invalid flows defined as follow:
1- Val id f l ow i s a f l ow th a t s t ar t wit h a SY N f l ag and end s wi th a FI N or a RS T fl ag.
Valid flows are further grouped into accepted and not-accepted flows.
A c c e p t e d f l o w i s a s e q u e n c e o f pack ets th at can b e mapp ed to th e set of input sy mbols Σ of the automaton and accepted by the protocol machine.
N o t - a c c e p t e d f l o w i s a s e q u e n c e o f p ac k e t s t h a t c a n b e m a p p e d t o t h e s e t o f i n p u t sy mbols Σ of the automaton, however they are not accepted by the machine.
2- Inv a lid flo w i s a f low th at st ar t s w ith a S Y N f la g and i t do e s not end s wi th a FI N or a RST f l ag. Thi s o c cu r s wh en pa ck et s a r e d ropp ed o r los t du e to a c ong es ti on in a l in k , o r d u e to ab n o r ma l p a c k et s s en t b y th e i n t ru d e rs . An i n v a lid fl o w i s a s eq u en c e o f p a ck et s t h a t c a n n o t b e m a p p e d t o t h e s e t o f i n p u t s y m b o l s o f t h e a u t o m a t o n . T h e y a r e n o t a c c ep t ed b y t h e p r o t o co l ma c h i n e. T h e i n t r u sio n d e t e ct i o n sy s t e m ( I DS ) i s a n e w p r o c e ss at t a ch ed t o the pro t oco l ma ch i n e. Thi s p r ogr a m r e a d s th e Inv al id F lo ws and ch e ck s th e m ag ain s t th e kno wn pat t ern s s u ch ho s t s c an, po rt sc a n , D N S a tt a ck, SY N at t ac k , an d
D eni a l of Se rvi c e (D OS ) att a ck et c. Th e t h e si s sho w s th a t th e ra tio o f th e inv a lid f low s to the total flows can be used as an index of the link congestion.
C h a p t e r 4 d e s c r i b es t h e M o b i l e I P t e c h n o l o g y a n d t h e e x i s t i n g p r o b l e m s . M o b i l e I P i s a p r o mi s i n g t e c h n o l o g y a n d a h i g h l y f e a s i b l e me c h an i s m t h a t all o w s a mo b il e n o d e t o mo v e a mo n g n e tw o r k s an d a t th e s a me t i me k e ep s i ts co n n e ct io n s a n d b e r e a ch a b le to the r es t of the Int e rn et . I P mob ili ty al low s p a ck et s t o be se nt t o the h o m e addr e s s whi ch is fo rw a rd ed to th e mob il e n o d e. It a l so hid es a n y add r es s c h ang e s fro m th e t r an spo rt an d ap p l i c at i o n l ay e r s . I n M o b i l e I P, e a c h mo b i l e h o s t i s i d e n t i f i ed w i t h a s t at i c h o me add r es s r e ga rdl e s s o f th e cur r e nt po int in th e Int e rn et . T h e ho me ad dr e s s is sto r ed by the ho me ag en t at th e h o me l ink. Wh en a mobil e nod e i s lo cat e d at a fo r eig n link , i t is add r es s ab l e by a ca r e of addr e s s (C oA ), in ad d i t i o n t o t h e h o me ad d r e s s. T h e c a r e o f a d d r e s s p r o v i d e s i n f o r m a t i o n a b o u t t h e c u r r e n t l o c a t i o n o f t h e m o b il e h o s t , a n d s h o u l d be r egi s t er ed w ith th e h o me ag ent wh en i t i s ch a n g e d. Th e ma pping or a s so ci at ion of th e c ar e o f a d d r es s a n d t h e h o me ad d r e s s i s cal l e d a b i n d i n g . We e x t e n d t h e p r o to co l m a c h i n e w i t h a p r e - p r o c e s s o r t o d e a l w i t h t h e I P t u n n e l i n g a n d t o r e s o l v e t h e d u al I P address as well.
C h a p t e r 5 p r o p o s e s a n d i m p l e m e n t s a n e w b i n d i n g u p d a t e m e t h o d i n M o b i l e I P v 6 . I n the cur r en t mob il e I P st and a rd , th e ho me ag en t s of a mo bi le ho st ar e lo c at ed in a c en t r al i ze d l o c at i o n , c al l ed t h e h o m e l i n k . I f a m o b il e n o d e m o v e s t o a f o r e i g n l i n k f a r fro m th e h o me lin k , it wo u ld c au s e d el ay in th e b in d in g u p d a t e ( B U ). Bi n d in g Up d at e i s v a l i d f o r t h e l i f e t i m e a n d s h o u l d b e r e f r e s h ed b e f o r e t h e l i f e t i m e e x p i r e s . T h i s a l s o gen e ra t e s ext r a unw ant ed t r aff ic in th e I n te rn et . T h is p ap er propo s e s a nd i mp l e m ent s a n e w t e c h n i q u e b y i n c r e a s i n g t h e n u m b e r o f h o me a g en t s an d d i s t r i b u t e s t h e m a c r o s s t h e Int e rn et. Thi s me t h od he lp s a mob il e hos t to fi nd th e cl os e st ho me ag ent f o r f a st r e g i s t r a t i o n a n d b i n d i n g u p d a t e . I t s h o rt e n s t h e bin d ing upd a t e t i me and al so r edu c e s th e traffic in the Internet.
Ch ap t e r 6 co n cl u d es th i s th e si s . Th i s t h es is p ro p o se s to an a ly z e n e tw o r k t ra ff ic s n o t o n t h e p a c k e t - b y -p a c k e t b as i s , b u t b y u s i n g g r o u p o f p a c k e t s c a l l e d a f l o w. I t u s e s a n d ex t en d s a TC P p ro to co l ma c h i n e wh i ch is a mu lt ip u r p o s e to o l. Th e n e w p ro to co l ma ch i n e c an b e ap p li ed to wo rk in g n et w o rk s fo r p ra c ti c al fl o w a n aly si s, in tru s io n d et e ct io n an d c o n g e s t i o n mo n i t o r i n g . T h i s t h e si s a ls o sh o w s h o w to d e a l w i t h M o b i l e I P b y t h e proto co l ma chin e by tr an sl a ting I P add re s s es . I n o rd er t o r e d u c e t h e b in d i n g u p d a t e t i m e i n M o b i l e I P, t h i s t h e s i s p ro p o s e s a n e w b i n d i n g u p d a t e m e t h o d i n m o b i l e I P v 6 . I t h e l p s t h e m o b il e h o s t s f o r f a s t r e g i s t r a t i o n a n d b i n d i n g u p d a t e a n d a l s o r e d u c e s t r a ff i c i n t h e I n t e r n e t . T h i s t h e s i s p r o p o s e s a n e x t e n s i o n o f t h e o ri g i n a l p ro to c o l m a c h i n e w h i c h c a n be used for a variety of working networks.
3
研 究 業 績
種 類 別 題名、 発表・発行掲載誌名、 発表・発行年月、 連名者(申請者含む)
論文 論文 講演 講演 講演 講演
Heshmatollah KHOSRAVI, Hiroaki FUKUDA and Shigeki GOTO.
New Binding Update Method in Mobile IPv6, The International Conference on Information Networking (ICOIN), Feb 2004, p.1260-1269, LNCS 3090 pp. 267-276.
Heshmatollah KHOSRAVI, Masaki FUKUSHIMA and Shigeki GOTO.
An Improved TCP Protocol Machine for Flow Analysis and Network Monitoring", IEICE (The Institute of Electronics, Information and Communications Engineers) Transaction, Vol.E86-B No.2, Feb 2003 p.595-603.
Heshmatollah KHOSRAVI and Shigeki GOTO, Applying TCP Protocol Machine In Mobile IPv6, ITRC Seminar, Daejeon, Korea, 25-27 Nov 2004.
Heshmatollah KHOSRAVI and Shigeki GOTO, Distributed Home Agent in Mobile IPv6, Core University Seminar, Karuizawa, Japan, May 2004.
Yichi ARAI, Shuhei MIURA and Heshmatollah KHOSRAVI, Combining Access Grid with Mobile IP", Core University Seminar, Karuizawa, Japan, May 2004.
Heshmatollah KHOSRAVI and Shigeki GOTO, Applying a new TCP Protocol Machine for Network Monitoring", Technical Report of IEICE (The Institute of Electronics, Information and Communications Engineers), Vol.101 No.440, IA2002-34 (2002-10), October 2002, p.89-96.