Figure 1-(a): A series of smoothed centripetal profiles of TIM protein is shown. The horizontal and the vertical axes show the residue number and the centripetal index F, respectively. The definition of F is in the text. The arrows indicate the local minima of the centripetal profiles. The local minima of these profiles reproduce the results obtained from the distance map method in Figure 2 to within a few residues. The candidates refer to the conditions of the extension profiles of the protein.
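The reading of such a profile can be sketched in code. The following is a minimal illustration only: the definition of F is given in the text and is not reproduced here, and the moving-average smoothing, the window size, and the function names `smooth` and `local_minima` are assumptions made for this sketch, not the authors' procedure.

```python
def smooth(profile, window):
    """Moving average (assumed smoothing); the window is truncated at both ends."""
    half = window // 2
    out = []
    for i in range(len(profile)):
        lo = max(0, i - half)
        hi = min(len(profile), i + half + 1)
        out.append(sum(profile[lo:hi]) / (hi - lo))
    return out


def local_minima(profile):
    """Return 0-based indices that are strictly lower than both neighbours."""
    return [
        i
        for i in range(1, len(profile) - 1)
        if profile[i] < profile[i - 1] and profile[i] < profile[i + 1]
    ]


# Hypothetical index values, one per residue (not data from the paper).
raw = [5, 3, 4, 6, 2, 7]
smoothed = smooth(raw, window=3)
candidates = local_minima(smoothed)  # candidate boundary positions (0-based)
```

Repeating the call with several window sizes gives a series of smoothed profiles; minima that persist across windows would be the more robust candidates.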
Figure 1-(b): A series of extension profiles of TIM protein is shown. The horizontal and the vertical axes show the residue number of the protein and the extension index E, respectively. The definition of E is explained briefly in the text. The arrows indicate the positions of typical local maxima of the profiles, which are in accord with the local minima of the centripetal profiles (a). The compactness of each local segment is confirmed by these profiles.
Figure 2: A distance map of chicken triosephosphate isomerase (TIM) is demonstrated according to the X-ray crystallographic data set. This data set is recorded in the Brookhaven National Laboratory data bank. Both the horizontal and the vertical axes show the residue numbers of the protein. A point (i, j) within the triangle shows the distance between the i-th and the j-th residues. The pairs which are further than 27 Å from each other are shown in black, the pairs which are closer than 8 Å are enclosed, and the others are left in white. Each of the solid lines on the map starts from a module boundary and moves in both vertical and horizontal directions. The arrows indicate the positions of introns observed in the chicken gene. The intron positions are near the module boundaries to within a few residues, as in the case of hemoglobin (Go, 1981). TIM protein is composed of at least 14 modules. The first boundary is at the 11th residue, which identifies a relatively short module at the N terminus. However, there is an intron at this position in the TIM genes of other species. Although none of the distances between the residues in the fifth module and those in the sixth module is more than 27 Å, some of these distances are more than 25 Å, and these two modules are found to be isolated from each other when the distance relations between the residues in these two modules and the residues in the ninth module are examined. The residues within each module are close to one another in three-dimensional space, indicating the compactness of modules (Go, 1981).
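The shading rule of the distance map can be sketched as follows. This is a minimal illustration, assuming one representative coordinate per residue (for example the C-alpha atom, which the legend does not specify); the function names and the toy coordinates are hypothetical. Only the 27 Å and 8 Å thresholds come from the legend.

```python
import math


def distance_map(coords):
    """Symmetric matrix of pairwise distances (in angstroms) between residues."""
    n = len(coords)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            dist[i][j] = dist[j][i] = d
    return dist


def classify_pairs(dist, far=27.0, near=8.0):
    """Label each pair (i, j), i < j: 'far' (black), 'near' (enclosed), 'mid' (white)."""
    labels = {}
    n = len(dist)
    for i in range(n):
        for j in range(i + 1, n):
            d = dist[i][j]
            if d > far:
                labels[(i, j)] = "far"
            elif d < near:
                labels[(i, j)] = "near"
            else:
                labels[(i, j)] = "mid"
    return labels


# Hypothetical coordinates (angstroms), not taken from the TIM data set.
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
labels = classify_pairs(distance_map(coords))
```

Only the upper triangle (i < j) is labelled, matching the triangular map described in the legend.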
Figure 3: Centripetal profiles of porcine adenylate kinase are demonstrated. The horizontal and the vertical axes show the residue number and the index F, respectively. The arrows indicate the identified module boundaries, and the two white arrowheads with dashed lines illustrate possible additional boundaries. There are at least 14, and possibly 16, modules.
Figure 4: The alignment of ten isozyme sequences is represented, in which either a large deletion or a large insertion exists at residue number 132 of porcine AK. Here, AK3B, AK2B, AKY, AKE, AK1F, AK1C, AK1R, AK1P, AK1B, and AK1H are adenylate kinases in: bovine mitochondria matrix, bovine mitochondria inter-membrane, yeast cytosol, E. coli, carp muscle, chicken muscle, rabbit muscle, porcine muscle, bovine red cells, and human muscle, respectively. Interestingly, there are deletions of more than four residues at the two possible module boundaries (102 and 138), suggesting that these two positions were originally module boundaries.
Figure 5: The module organization of porcine adenylate kinase is illustrated, in which the position (at residue number 132) and the size of the inserted or deleted segment (26 residues) are shown.
Figure 6: Stick and ball models of porcine adenylate kinase drawn from the BNL coordinate data set are demonstrated in stereo views from different directions. Each arrow in (a) and (b) indicates the position of a large alteration which is near the top of the wall forming a large cleavage. The 26-residue segment, which is included only in the long isozymes, covers a part of this cleavage in AKY (Schulz et al., 1986).
Figure 7: The phylogenetic tree of these adenylate kinases is shown. AK3B, AK2B, AKY, AKE, AK1F, AK1C, AK1R, AK1P, AK1B, and AK1H are the same as explained in Figure 4. The short type enzymes make a cluster on this tree and are supposed to have diverged from the long isozymes by gene duplication in the early stages of this protein's evolution. Divergent point 1 shows the gene duplication and it, as well as the other numbered points (2, 3 and 4), is a possible divergent point of prokaryotes and eukaryotes. The short type adenylate kinases are in accord with the species divergence, whereas the long type enzymes have some complexity.
Figure 8: The frequency of the absolute differences in the common results of the original and of the refined methods is shown. The horizontal axis shows the difference expressed in the number of residues and the vertical axis shows the number of differences. The new results reproduce 89 percent of the old results within an error range of ±2 residues, and the average difference of these 143 corresponding boundaries is 1.2 residues. This degree of accuracy suggests the suitability of representing the old results by the refined method.
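The three quantities reported in this legend (the histogram of absolute differences, the fraction within 2 residues, and the average difference) can be computed as in the sketch below. It assumes the old and new boundaries have already been paired one to one; how the 143 pairs were matched is not stated here, and the function name and the example positions are hypothetical.

```python
from collections import Counter


def compare_boundaries(old, new, tolerance=2):
    """Summarize absolute differences between paired boundary positions."""
    diffs = [abs(o - n) for o, n in zip(old, new)]
    histogram = Counter(diffs)  # difference in residues -> count
    fraction_within = sum(1 for d in diffs if d <= tolerance) / len(diffs)
    mean_difference = sum(diffs) / len(diffs)
    return histogram, fraction_within, mean_difference


# Hypothetical paired boundaries (residue numbers), not the paper's data.
old = [11, 30, 52, 75]
new = [12, 30, 49, 76]
histogram, fraction_within, mean_difference = compare_boundaries(old, new)
# fraction_within == 0.75, mean_difference == 1.25
```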
Figure 9: The distribution of module size is demonstrated. The horizontal and the vertical axes represent the module length and the frequency of each length, respectively. Here, (a) is the total distribution of identified modules, (b) is the total distribution of internal modules, which excludes the N- and C-terminal modules, (c) is the distribution of modules from eukaryotes, (d) is the distribution of modules from prokaryotes, and (e) is the distribution of modules from virus and phage. Since this improved method detects additional module boundaries, the module sizes are smaller than in the previous study (Go and Nosaka, 1987). All of these distribution patterns are similar to one another.
Figure 10: The relation of a protein's average module size to the protein's total length is shown. The horizontal and the vertical axes show the total peptide length and the average module size of the protein, respectively. Both are expressed in number of residues. Although there is no significant correlation between them, this does suggest that modules are relatively uniform in size and are independent of the protein's length. In smaller proteins of less than about 100 residues the average module size varies from 10 to 22 residues, while in larger proteins the average module size is within the narrower range of 14 to 18 residues. Since the results indicate that some proteins with larger modules, which are marked with a circle in Fig. 10, are extremely helix-rich, the correlations between the average module length of proteins and the secondary structure contents in the proteins are studied in Section 3.
Figure 11-1: The size distribution of 1056 exons from 210 genes is shown. The horizontal axis shows the exon length in number of amino acids, and the vertical axis shows the frequency of each exon length. Only the peptide-coding exons are compiled, and the N- and C-terminal exons which have untranslated parts are not included. Exons which are longer than 600 nucleotides are not shown because such exons are small in number and sparsely scattered. It is worthwhile to notice that the size distribution of the exons is in accord with the size distribution of the modules. The small-exon part of this distribution is similar to the distribution of modules, and the larger-exon part can be regarded as the combination of the several distributions of the connected modules.
Figure 11-2: The main part of the size distribution of exons (a) and the model distributions derived from the size distribution of modules are shown. The horizontal and the vertical axes are the same as in Fig. 11-1. Here, (b) is the best-fit combination of five distributions of segments which are convoluted from one to five modules, and (c) is the distribution of the convoluted segments using a random number of modules. A comparison between (a) and (c) suggests that modules did not accumulate randomly to make an exon and that introns were not deleted randomly. The most effective distribution is that of three modules. This number coincides with the ratio obtained from the comparison between the intron positions and the confirmed module boundaries.
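The model distributions described here can be sketched by convolving a module-size distribution with itself. This is a minimal illustration under stated assumptions: module lengths are treated as independent draws from one discrete distribution, the mixture weights are hypothetical, and the fitting step that produced the best-fit combination in (b) is not shown because its details are not given in the legend.

```python
def convolve(p, q):
    """Distribution of the sum of two independent lengths drawn from p and q."""
    out = {}
    for a, pa in p.items():
        for b, qb in q.items():
            out[a + b] = out.get(a + b, 0.0) + pa * qb
    return out


def k_module_distribution(module_dist, k):
    """Length distribution of a segment made of k connected modules (k >= 1)."""
    result = dict(module_dist)
    for _ in range(k - 1):
        result = convolve(result, module_dist)
    return result


def mixture(module_dist, weights):
    """Weighted sum of the 1..len(weights) module distributions.

    weights[k - 1] is the (assumed) share of segments built from k modules.
    """
    mixed = {}
    for k, w in enumerate(weights, start=1):
        for length, prob in k_module_distribution(module_dist, k).items():
            mixed[length] = mixed.get(length, 0.0) + w * prob
    return mixed


# Hypothetical module-size distribution: length in residues -> probability.
module_dist = {10: 0.5, 20: 0.5}
two_modules = k_module_distribution(module_dist, 2)
model = mixture(module_dist, [0.5, 0.5])
```

In a fit of the kind the legend describes, the weights would be adjusted until the mixture matches the observed exon distribution (a); a dominant weight for k = 3 would correspond to the statement that the three-module distribution is the most effective.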
Figures 12: The correlations between module boundaries and the secondary structures of 24 proteins are expressed. The horizontal and the vertical axes show the secondary structure ratios of the proteins and the corresponding secondary structure ratios only on the module boundaries in the proteins, respectively. The helix ratios are shown in (a), the β-structure ratios are shown in (b), the turn ratios are shown in (c), and the random coil ratios are shown in (d). Lines with slope 1 indicate the standard situation where the module boundaries have no preference for the secondary structure.
Figures 13: The correlations between the average module size of a protein and the secondary structure ratios of the proteins listed in Table 5 are demonstrated. The module boundaries of these 85 proteins are identified by the further improved method. Each of the horizontal axes shows the average module size of a protein, and the vertical axes show the helix ratio of the protein (a), the β-structure ratio of the protein (b), the turn ratio of the protein (c), and the random coil ratio of the protein (d), respectively. The average module size of a protein correlates positively with the helix ratio of the protein. This correlation coefficient is 0.56, which is entirely significant for all 85 proteins. However, the average module size of a protein shows only weak negative correlations with the ratios of turns and random coils and shows no correlation with the ratio of β-strand.
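A correlation of this kind can be computed as in the sketch below. It assumes the reported value of 0.56 is a Pearson product-moment correlation coefficient, which the legend does not state explicitly; the function name and the sample values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values: average module size (residues) and helix ratio.
avg_module_size = [12.0, 14.0, 15.0, 18.0, 21.0]
helix_ratio = [0.20, 0.35, 0.30, 0.55, 0.60]
r = pearson_r(avg_module_size, helix_ratio)
```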
Figure 14-(a): The frequency of the deviations between the 125 intron positions and the corresponding module boundaries in 24 proteins is demonstrated. The horizontal axis shows the deviation between each intron and the nearest module boundary. The vertical axis shows the number of deviations. Introns which interrupt codons after the first and the second bases are plotted together with introns which interrupt codons after the third bases. The dominant deviation is 1 residue, and the average of the deviations is 2.8 residues.
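The deviation plotted here is the distance from each intron to its nearest module boundary. A minimal sketch follows, assuming intron positions are expressed as residue numbers on the same scale as the boundaries; the function names and the example positions are hypothetical.

```python
from collections import Counter


def nearest_boundary_deviations(intron_positions, boundaries):
    """For each intron, the distance in residues to the closest module boundary."""
    return [min(abs(p - b) for b in boundaries) for p in intron_positions]


def summarize(deviations):
    """Histogram of deviations, the most frequent deviation, and the mean."""
    histogram = Counter(deviations)
    dominant = histogram.most_common(1)[0][0]
    mean = sum(deviations) / len(deviations)
    return histogram, dominant, mean


# Hypothetical positions (residue numbers), not the paper's data.
boundaries = [11, 30, 52, 75]
introns = [12, 40, 77, 53]
deviations = nearest_boundary_deviations(introns, boundaries)  # [1, 10, 2, 1]
histogram, dominant, mean = summarize(deviations)
```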
Figure 14-(b): The probable frequency of each deviation is illustrated when 125 introns are inserted into genes independently of module boundaries. The horizontal and the vertical axes are the same as for Figure 14-(a). To calculate this probability from simulation, the size distribution of modules and typical intron positions on a gene are employed. A χ²-test of these frequency distributions, (a) and (b), confirmed the correlation between intron positions and module boundaries.
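The comparison between an observed and a simulated deviation distribution can be sketched as follows. This is an illustration under simplifying assumptions, not the authors' simulation: introns are placed uniformly at random along a single hypothetical protein, whereas the legend states that the size distribution of modules and typical intron positions on a gene were used. The function names, the seed, and the example values are all hypothetical.

```python
import random
from collections import Counter


def nearest_deviation(position, boundaries):
    """Distance in residues from a position to the closest module boundary."""
    return min(abs(position - b) for b in boundaries)


def expected_counts(boundaries, length, n_introns, trials=2000, seed=0):
    """Mean count per deviation when introns ignore module boundaries.

    Assumption: uniform random placement on residues 1..length.
    """
    rng = random.Random(seed)
    totals = Counter()
    for _ in range(trials):
        for _ in range(n_introns):
            totals[nearest_deviation(rng.randint(1, length), boundaries)] += 1
    return {dev: count / trials for dev, count in totals.items()}


def chi_square(observed, expected):
    """Pearson chi-square statistic over the categories present in `expected`.

    Categories with zero expected count are skipped to avoid division by zero.
    """
    stat = 0.0
    for dev, e in expected.items():
        if e > 0:
            stat += (observed.get(dev, 0) - e) ** 2 / e
    return stat


# Hypothetical example: 4 boundaries on a 100-residue protein, 125 introns.
expected = expected_counts([11, 30, 52, 75], length=100, n_introns=125)
```

The resulting statistic would then be compared with a χ² critical value for the appropriate degrees of freedom; in practice, sparse tail categories are usually pooled before the test.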