Microarray Basic Analysis II

Academic year: 2018

(1)

Microarray Analysis You Can Do at Home

•  Desktop or laptop PC/Linux/Mac

•  Free software

•  Free Array Data

Speaker: Dr. 董建億

VYMGC Microarray Core lab

(2)

Workflow

•  Data preprocessing

–  Normalization

–  Probe summarization

–  Signal adjustment (flag, flooring)

–  Signal transform (fold change, log transform)

•  Data QC/QA

–  QC index (Hybridization control)

–  Similarity analysis (Clustering, PCA, MDS)

•  Find differentially expressed genes

–  Statistical analysis

–  Set cut-off

•  Biological interpretation

–  Gene annotation

–  Gene set analysis

–  Network analysis

(3)

System requirement

•  Hardware:

–  CPU: Pentium 4 or better

–  4 GB RAM on a 64-bit OS is strongly recommended

–  With great patience, 2 GB RAM on a 32-bit OS is workable

•  Software:

–  R GUI environment

•  http://cran.csie.ntu.edu.tw/

–  MultiExperiment Viewer (MeV)

•  http://sourceforge.net/projects/mev-tm4/files/latest/download

(4)
(5)

MultiExperiment Viewer

MeV is a desktop application for the analysis, visualization and data-mining of large-scale genomic data. (http://www.tm4.org/mev/)

(6)

The R Project for Statistical Computing

R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS. (http://www.r-project.org/)

(7)

R language

(8)

Installation

•  Download MeV (4.8) and unzip it to your hard drive

•  Install the Java 3D API (32-bit) for the PCA 3D plot

•  Download EasyKit.zip and unzip it into the R folder of MeV

•  Run RunMeFirst.bat

•  Run RGUI

•  library(lucasLazyPack)

•  lz.GUI()

(9)

Data source

•  NCBI GEO (http://www.ncbi.nlm.nih.gov/geo/)

(10)

Demo samples

GSE12211

http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE12211

Normalized Data  Matrix 

Raw Data 

(11)

Workflows

•  Data sources: NCBI & ArrayExpress

•  .CEL files (Affymetrix): Raw data (.cel) > Normalization (RMA) > Hybridization QC report

•  .txt files (Agilent, Illumina, gpr): Raw data > Normalization > Hybridization QC report

•  Probe signal table

•  Data QC/QA (Box Plot, MDS, PCA)

•  Statistical analysis (limma, ANOVA)

•  Clustering (HCL, SOM, ...)

•  Steps are carried out by R script or by MeV
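The normalization step of RMA is quantile normalization, which forces every array to share the same intensity distribution. The sketch below illustrates that one idea in plain Python rather than in the R tools used in this deck; it omits RMA's background correction and probe summarization, and the function name and sample values are hypothetical.

```python
from statistics import fmean


def quantile_normalize(columns):
    """Quantile-normalize a list of equal-length columns (one per array).

    Ties are broken by position, which is a simplification.
    """
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the k-th smallest value across arrays.
    reference = [fmean([col[k] for col in sorted_cols]) for k in range(n)]
    normalized = []
    for col in columns:
        # Indices of this column ordered from smallest to largest value.
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical intensities: two arrays, three probes each.
arrays = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(arrays)
print(normalized)  # [[5.5, 1.5, 3.5], [3.5, 1.5, 5.5]]
```

After normalization both arrays contain the same set of values; only the order of values within each array differs.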

(12)

lucasLazyPack

Download:

https://sites.google.com/site/lucastproject/tools/lucaslazypack

Install from EasyKit folder:

•  Double click RunMeFirst.bat

Install from RGUI:

•  install.packages(file.choose()), then select the downloaded zip file.

Usage:

In EasyKit folder, click RunGUI.bat

•  library(lucasLazyPack) 

•  lz.GUI()

(13)

Data preprocessing

•  Normalization

1.  Load CEL files

2.  Select normalization algorithms

3.  Show QC report

4.  Save result

•  Signal Flooring

1.  Load data file

2.  Select flooring level

3.  Save result

•  Log2 transform

•  Sample Reorder
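Signal flooring and the log2 transform can be sketched in a few lines. This is a plain-Python illustration, not the lucasLazyPack implementation; the flooring level of 16 and the intensities are arbitrary example values.

```python
import math


def floor_signals(values, floor_level):
    """Raise every signal below floor_level up to floor_level."""
    return [max(v, floor_level) for v in values]


def log2_transform(values):
    """Log2-transform strictly positive signals."""
    return [math.log2(v) for v in values]


# Hypothetical raw intensities for one probe across four arrays.
raw = [3.0, 16.0, 64.0, 1024.0]
floored = floor_signals(raw, floor_level=16.0)  # 16 is an assumed level
logged = log2_transform(floored)
print(logged)  # [4.0, 4.0, 6.0, 10.0]
```

Flooring before the log transform keeps near-zero background signals from turning into large negative values.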

(14)

QC report

(15)

Dimensional reduction

•  DimR.R

1.  Load Data matrix

2.  Assign data point

3.  Show MDS plot

•  PCA (MeV)
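MDS starts from a matrix of pairwise distances between samples. The sketch below builds such a matrix with Euclidean distance in plain Python; it does not compute the MDS embedding itself, and the sample names and expression values are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sample_distance_matrix(samples):
    """Return a nested dict of pairwise distances between named samples."""
    names = list(samples)
    return {a: {b: euclidean(samples[a], samples[b]) for b in names}
            for a in names}


# Hypothetical log2 signals for three genes in three samples.
samples = {
    "ctrl_1": [8.0, 5.0, 10.0],
    "ctrl_2": [8.0, 5.0, 11.0],
    "treat_1": [11.0, 9.0, 10.0],
}
dist = sample_distance_matrix(samples)
print(dist["ctrl_1"]["ctrl_2"], dist["ctrl_1"]["treat_1"])  # 1.0 5.0
```

Replicates that sit close together and far from the other group in this matrix are what a clean MDS or PCA plot reflects.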

(16)

MeV

(17)

Two class analysis

•  Create RMA data files (by LazyPack)

•  Load Data

–  [File]>[Load Data]>[Select File loader]>[Affymetrix]>[RMA File]

–  [Adjust Data]>[Gene\Row Adjustment]>[Mean center gene/rows]

•  Assign sample group

–  [Display]>[Sample/Column Label]>[Edit label/Reorder Samples]>[Edit]

–  [Cluster Manager]>[Sample Clusters]>[Auto-Cluster by Factor]

•  Statistics Analysis

–  T-test

•  2 classes

•  Paired samples

–  Limma

•  2 classes

–  Multiple test correction

•  Clustering (HCL)
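The "Mean center gene/rows" adjustment subtracts each gene's mean from its own row, so every row ends up with mean zero. A minimal plain-Python sketch of that operation (not MeV's code; the matrix values are hypothetical):

```python
from statistics import fmean


def mean_center_rows(matrix):
    """Subtract each row's mean from every value in that row."""
    centered = []
    for row in matrix:
        row_mean = fmean(row)
        centered.append([v - row_mean for v in row])
    return centered


# Hypothetical log2 signals: rows are genes, columns are samples.
matrix = [
    [8.0, 10.0, 12.0],
    [5.0, 5.0, 8.0],
]
centered = mean_center_rows(matrix)
print(centered)  # [[-2.0, 0.0, 2.0], [-1.0, -1.0, 2.0]]
```

Centering removes differences in baseline expression between genes, so clustering and heat maps reflect the pattern across samples rather than absolute signal level.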

(18)

Multiple Class analysis

•  Create RMA data files (by LazyPack)

•  Load Data

–  [File]>[Load Data]>[Select File loader]>[Affymetrix]>[RMA File]

–  [Adjust Data]>[Gene\Row Adjustment]>[Mean center gene/rows]

•  Assign sample group

–  [Display]>[Sample/Column Label]>[Edit label/Reorder Samples]>[Edit]

–  [Cluster Manager]>[Sample Clusters]>[Auto-Cluster by Factor]

•  Statistics Analysis

–  ANOVA

–  Limma

–  Multiple test correction

•  Clustering

–  Hierarchical Clustering

–  Self-Organizing Map

–  K-means/medians

(19)

Find differentially expressed genes

•  Statistics

–  Two samples (ratio)

–  One class sample

–  Two sample pools

•  t-test (Welch's t-test)

•  limma

–  >2 groups (1-way ANOVA)

–  Paired comparison (paired t-test)

–  Multi-factor analysis (n-way ANOVA)

–  Time course

•  Select D.E. genes

–  Multiple test correction

•  p-value correction

–  Filtration

•  Fold-change cut-off

•  p-Value cut-off

–  Volcano plot
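Multiple test correction followed by fold-change and p-value filtration can be sketched as below. The slides do not name a correction method, so the Benjamini-Hochberg procedure here is an assumption, as are the cut-offs (|log2 fold change| >= 1, adjusted p <= 0.05) and the gene records.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_de_genes(genes, fc_cutoff=1.0, p_cutoff=0.05):
    """Keep genes passing both the fold-change and adjusted p-value cut-offs."""
    adjusted = benjamini_hochberg([p for _, _, p in genes])
    return [name for (name, log2_fc, _), q in zip(genes, adjusted)
            if abs(log2_fc) >= fc_cutoff and q <= p_cutoff]


# Hypothetical results: (gene, log2 fold change, raw p-value).
genes = [
    ("geneA", 2.0, 0.01),
    ("geneB", 0.3, 0.04),
    ("geneC", -1.5, 0.03),
    ("geneD", 1.2, 0.20),
]
selected = select_de_genes(genes)
print(selected)  # ['geneA']
```

geneC has a raw p-value of 0.03 but an adjusted value of about 0.053, so it is dropped once the correction is applied.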

(20)

T-test

Equal sample sizes, equal variance 

Unequal sample sizes, equal variance 

Unequal sample sizes, unequal variance 
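The formulas for this slide did not survive in the extracted text. For the third case (unequal sample sizes, unequal variance), the standard Welch statistic and its Welch-Satterthwaite degrees of freedom can be sketched as follows. This is plain Python with hypothetical data; turning t and the degrees of freedom into a p-value needs a t-distribution function from a statistics package, which is left out here.

```python
import math
from statistics import fmean, variance


def welch_t(a, b):
    """Return Welch's t statistic and approximate degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (fmean(a) - fmean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical log2 signals for one gene in two groups.
treated = [8.0, 9.0, 10.0]
control = [5.0, 6.0, 7.0, 6.0]
t_stat, df = welch_t(treated, control)
print(round(t_stat, 3), round(df, 2))
```

The degrees of freedom are generally not an integer, which is expected for Welch's test.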

(21)

Analysis of Variance (ANOVA)
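One-way ANOVA compares the variation between group means with the variation within groups; their ratio is the F statistic. A minimal plain-Python sketch with hypothetical data (the p-value lookup from the F distribution is omitted):

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = fmean([v for g in groups for v in g])
    # Between-group sum of squares, weighted by group size.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares around each group's own mean.
    ss_within = sum((v - fmean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical log2 signals for one gene in three groups.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat = one_way_anova_f(groups)
print(round(f_stat, 3))  # 13.0
```

A large F means the group means differ by more than the within-group scatter would explain.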

(22)

Volcano plot

•  X axis: fold change (typically log2)

•  Y axis: statistical significance (p-value, typically plotted as -log10)
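A volcano plot places each gene at (fold change, significance). The sketch below computes those coordinates and a simple up/down label in plain Python, assuming the common convention of log2 fold change on X and -log10(p) on Y. The cut-offs and gene records are hypothetical, and the plotting itself is left to whichever tool is used.

```python
import math


def volcano_points(results, fc_cutoff=1.0, p_cutoff=0.05):
    """Return (gene, x, y, label) tuples for a volcano plot."""
    points = []
    for name, log2_fc, p_value in results:
        passes = abs(log2_fc) >= fc_cutoff and p_value <= p_cutoff
        if not passes:
            label = "not significant"
        elif log2_fc > 0:
            label = "up"
        else:
            label = "down"
        # X is the log2 fold change, Y is -log10 of the p-value.
        points.append((name, log2_fc, -math.log10(p_value), label))
    return points


# Hypothetical results: (gene, log2 fold change, p-value).
results = [
    ("geneA", 2.0, 0.001),
    ("geneB", -1.5, 0.01),
    ("geneC", 0.2, 0.5),
]
points = volcano_points(results)
for point in points:
    print(point)
```

Genes in the upper left and upper right corners of the plot are the ones that pass both cut-offs.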
