Conditional Graphical Models for Protein Structure Prediction

61
Carnegie Mellon School of Computer Science 1 Conditional Graphical Models for Protein Structure Prediction Yan Liu Language Technologies Institute School of Computer Science Carnegie Mellon University Oct 24, 2006

description

Conditional Graphical Models for Protein Structure Prediction. Yan Liu Language Technologies Institute School of Computer Science Carnegie Mellon University Oct 24, 2006. Nobelprize.org. DSCTFTTAAAAKAGKAKAG. Protein sequence. +. Protein function. Protein structure. - PowerPoint PPT Presentation

Transcript of Conditional Graphical Models for Protein Structure Prediction

Page 1: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

1

Conditional Graphical Models for Protein Structure Prediction

Yan Liu

Language Technologies InstituteSchool of Computer ScienceCarnegie Mellon University

Oct 24, 2006

Page 2: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

2

Snapshot of Cell BiologyNobelprize.org

+

Protein function

DSCTFTTAAAAKAGKAKAG

Protein sequence

Protein structure

Page 3: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

3

Protein Structures and Functions

Courtesy of

Nobelprize.org

Adenovirus Fibre Shaft Virus Capsid

Example: triple beta-spiral fold

Page 4: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

4

Protein Structure Determination

• Lab experiments: time and labor- consuming

X-ray crystallography Nobel Prize, Kendrew & Perutz, 1962

NMR spectroscopyNobel Prize, Kurt Wuthrich, 2002

• The gap between sequence and structure necessitates computational methods of protein structure determination 3,023,461 sequences v.s. 36,247 resolved structures

(1.2%)

1MBN

1BUS

Page 5: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

5

Protein Structure HierarchyWe focus on predicting the topology of the structures from sequences

APAFSVSPASGACGPECA

Page 6: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

6

Major Challenges

• Protein structures are non-linear Long-range dependencies

• Structural similarity often does not indicate sequence similarity Sequence alignment reaches

twilight zone (under 25% similarity)Ubiquitin (blue) Ubx-Faf1 (gold)

β-α-β motif

Page 7: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

7

Previous Work• Sequence similarity perspective

Sequence similarity searches, e.g. PSI-BLAST [Altschul et al, 1997]

Profile HMM, .e.g. HMMER [Durbin et al, 1998] and SAM [Karplus et al, 1998]

Window-based methods, e.g. PSI_pred [Jones, 2001]

• Physical forces perspective Homology modeling or threading, e.g. Threader [Jones, 1998]

• Structural biology perspective Methods of careful design for specific structures, e.g. αα- and ββ- hairpins, β-

turn and β-helix [Efimov, 1991; Wilmot and Thornton, 1990; Bradley at al, 2001]Generative models based on physical free-energy

Hard to generalize due to the various informative features

Fail to capture the structure properties

Page 8: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

8

Structured Prediction

• Many prediction tasks involve outputs with correlations or constraints

John ate the cat .

SEQUENCEXS…WGIKQLQAR

TreeSequence GridStructure

Input

Output

• Fundamental importance in many areas• Potential for significant theoretical and practical advances

HHHCCCEEE…EECCCCEEE

Page 9: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

9

Graphical Models

• A graphical model is a graph representation of probability dependencies [Pearl 1993; Jordan 1999]

Node: random variables Edges: dependency relations

• Directed graphical model (Bayesian networks)

• Undirected graphical model (Markov random fields)

11..

( ,.., ) ( | parents( ))n i ii n

P x x P x x

1

1 1( ,.., ) ( ) exp( ( ))n c c c

c C c C

P x x x H xZ Z

Page 10: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

10

Conditional Random Fields• Hidden Markov model (HMM) [Rabiner, 1989]

• Conditional random fields (CRFs) [Lafferty et al, 2001]

Model conditional probability directly Allow arbitrary dependencies in observation Adaptive to different loss functions and

regularizers Promising results in multiple applications

11

( ) ( | ) ( | )N

i i i ii

P P x y P y y

x, y

11 10

1( ) exp( ( , , , ))

N K

k k i ii k

P f i y yZ

y | x x

1 1( , , , ) ' ( , ) ( , ')k i i k i if i y y f i I y s y s x x

Page 11: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

11

Protein Structure Prediction

• Dependency between residues (single observation)

• Dependency between components (subsequences of observations)

Page 12: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

12

Outline• Brief introduction to protein structures• Graphical models for structured-prediction

• Conditional graphical models for protein structure prediction General framework Specific models

• Experiment results• Conclusion and discussion

Page 13: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

13

• Outputs Y = {M, {Wi} }, where Wi = {pi, qi, si}

• Feature definition Node feature

Local interaction feature

Long-range interaction feature

Our Solution: Conditional Graphical Models

1 1 1( , , ) ( , ', 1)k i i i i i if w w x I s s s s p q

( , ) '( , , ) ( ', 1 ')k i k i i i i if w x f x p q I s s q p d

Long-range dependencyLocal dependency

1( , , ) '( , , , , ) ( , ')k i j k i i j j i if w w x g x p q p q I s s s s

Page 14: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

14

• Conditional probability given observed sequences x is defined as

• Prediction:

• Training phase : learn the model parameters λ Minimizing regularized negative log loss

Iterative search algorithms by seeking the direction whose empirical values agree with the expectation

Conditional Graphical Models (II)

1

1( | ) exp( ( , ))

G

K

k k cc C k

fP Y YZ

x x

( | )( ( , ) [ ( , )]) ( ) 0G

k c p k cc Ck

Lf E f

y xx y x y

21

( , ) log ( )G

K

k k cc C k

L f Z

x y

1

* arg max ( , )G

K

k k cc C k

y f Y

x

Page 15: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

15

Major Components

• Graph topology Secondary structure prediction: CRF, kernel CRF Tertiary fold recognition: Segmentation CRF, Chain graph

model Quaternary fold recognition: Linked segmentation CRF

• Efficient inference Prefer exact inference with O(nd) complexity Resort to approximate inference

• Features Allows flexible and rich feature definition

Page 16: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

16

Protein Secondary Structure Prediction

• Given a protein sequence, predict its secondary structure assignments Three classes: helix (H), sheets (E) and coil (C)

Input: APAFSVSPASGACGPECA

Output: CCEEEEECCCCCHHHCCC

Page 17: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

17

CRF on Secondary Structure Prediction [Liu et al, Bioinformatics 2004]

• Node semantics – secondary structure assignment• Graphical model - conditional random fields (CRFs) or kernel CRF

• Inference algorithm - efficient inferences exists, such as forward-backward or Viterbi algorithm

C C E E …. ... C

11 1

1( | ) exp( ( , , , ))

N K

k k i ii k

P f i y yZ

y x x

Page 18: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

18

Protein Fold Recognition and Alignment

• Protein fold: identifiable regular arrangement of secondary structural elements

• Different from previous simple fold classification

• Provide important information and novel biological insights

Input: ..APAFSVSPASGACGPECA..

Output 1: Does the target fold exist?

Output 2: ..NNEEEEECCCCCHHHCCC..

Yes

Testing PhaseTraining Phase

Page 19: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

19

Conditional Graphical Model for Fixed Template Fold

[Liu et al, RECOMB 2005]

• Node semantics - secondary structure elements of variable lengths

• Graphical model - segmentation conditional random fields (SCRFs)

• Inference - forward-backward and Viterbi-like algorithm can be derived given some assumptions

β-α-β motif

1

1( | ) exp( ( , ))

G

K

k k cc C k

P y f yZ

x x

Page 20: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

20

Conditional Graphical Model for Repetitive Fold Recognition

[Liu et al, ICML 2005]

• Node semantics - two layer segmentation Y = {M, {Ξi}, T} Level 1: envelop, or one repeat, level 2: components of one repeat

• Graphical model - Chain graph model A graph consisting of directed and undirected graphs

• Inference - forward-backward algorithm and Viterbi-like algorithm

1 11

( | ) ( ,{ }, | ) ( ) ( | , ) ( | , , )M

i i i i i ii

P Y P M T P M P T x P T

x x x( )

,1 ,

1 1

1( | , , ) exp( ( , , )

j

i j

M K

i i i k k i jj ki

P T f W WZ

x x

Page 21: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

21

Conditional Graphical Model for for Quaternary Fold Recognition

[Liu et al, IJCAI 2007]

• Node semantics – secondary structure elements and/or simple fold

• Graphical model - linked segmentation CRF (L-SCRF) Fix template and/or repetitive subunits Inter-chain and intra-chain interactions

, , ,

1 1 , , ,,

1( ,..., | ,..., ) exp( ( , )) exp( ( , , , ))

i j G i j a b G

R R k k i i j l k i a i j a bV k lE

P f g yZ

y y y

y y x x x y x x y

Page 22: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

22

Approximate Inference

• Varying dimensionality requires reversible jump MCMC sampling [Greens, 1995, Schmidler et al, 2001]

• Four types of Metropolis proposals State switching Position switching Segment split Segment merge

• Simulated annealing reversible jump MCMC [Andireu et al, 2000]

Replace the sample with RJ MCMC Theoretically converge on the global optimum

Page 23: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

23

Conditional Graphical Models for Protein Structure Prediction

Page 24: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

24

Model Roadmap

Conditional random fields

Segmentation CRFs

Chain graph model

Linked segmentation CRFs

Segment Correlations

Inter-chain Segment Correlations

Local and Global Tradeoff

Generalized as conditional graphical models

Kernel CRFsKernelization

Page 25: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

25

Outline• Brief introduction to protein structures• Graphical models for structured prediction

• Conditional graphical models for protein structure prediction

• Experiment results Fold recognition Fold alignment prediction Discovery of potential membership proteins

• Conclusion and discussion

Page 26: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

26

Experiments: Target Fold• Right-handed β-helix fold [Yoder et al, 1993]

Bacterial infection of plants, binding the O-antigen and so on

• Leucine-rich repeats (LLR) [Kobe & Deisenhofer, 1994]

Structural framework for protein-protein interaction

Page 27: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

27

Experiments: Target Quaternary Fold

• Triple beta-spirals [van Raaij et al. Nature 1999]

Virus fibers in adenovirus, reovirus and PRD1

• Double barrel trimer [Benson et al, 2004]

Coat protein of adenovirus, PRD1, STIV, PBCV

Page 28: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

28

Tertiary Fold Recognition: β-Helix fold

• Histogram and ranks for known β-helices against PDB-minus dataset

5

Chain graph model reduces the real running time of SCRFs model by around 50 times

Page 29: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

29

Quaternary Fold Recognition: Triple β-Spirals

• Histogram and ranks for known triple β-spirals against PDB-minus dataset

Page 30: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

30

Quaternary Fold Recognition: Double Barrel-Trimer

• Histogram and ranks for known double barrel-trimer against PDB-minus dataset

Page 31: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

31

Fold Alignment Prediction: β-Helix• Predicted alignment for known β -helices on cross-family

validation

Page 32: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

32

Fold Alignment Prediction:LLR and Triple β-Spirals

• Predicted alignment for known LLRs using chain graph model (left) and triple β-spirals using L-SCRFs

Page 33: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

33

Discovery of Potential β-helices

• Hypothesize potential β-helices from Uniprot reference databases Full list can be accessed at

www.cs.cmu.edu/~yanliu/SCRF.html

• Verification on proteins with later resolved structures from different organisms 1YP2: potato tuber ADP-glucose pyrophosphorylase

1PXZ: major allergen from Cedar Pollen

GP14 of Shigella bacteriophage as a β-helix protein

Page 34: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

34

Conclusion• Thesis Statement

Conditional graphical models are effective for protein structure prediction

• Strong claims Effective representation for protein structural properties Flexibility to incorporate different kinds of informative features Efficient inference algorithms for large-scale applications

• Weak claims Ability to handle long-range interactions Best performance bounded by prior knowledge

Page 35: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

35

Contribution and Limitation

• Contribution to machine learning Enrichment of graphical models Formulation to incorporate domain knowledge

• Contribution to computational biology Effective for protein structure prediction and fold

recognition Solutions for the long-range interactions (inter-chain and

intra-chain)

• Limitation Manual feature extraction Difficulty in verification High complexity

Page 36: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

36

Future Work• Computational biology

• Machine Learning

Protein structure prediction

+

Protein function and protein-protein interaction prediction

Drug target design

Graph topology learningGraph-based

semi-supervised learning

Active learning for structured data

Page 37: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

37

Acknowledgement

• Jaime Carbonell, Eric Xing, John Lafferty, Vanathi Gopalakrishnan

• Chris Langmead, Yiming Yang, Roni Rosenfeld, Peter Weigele , Jonathan King, Judith Klein-Seetharaman, , Ivet Bahar, James Conway and many more

• And fellow graduate students …

Page 38: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

38

Page 39: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

39

Features for Tertiary Fold Recognition

• Node features Regular expression template, HMM profiles Secondary structure prediction scores Segment length

• Inter-node features β-strand Side-chain alignment scores Preferences for parallel alignment scores Distance between adjacent B23 segments

• Features are general and easy to extend

Page 40: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

40

Features for Protein Fold Recognition

Page 41: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

41

Discovery of Potential Double Barrel-Trimer

• Potential proteins suggested in [Benson, 2005]

Page 42: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

42

Inference Algorithm for SCRF

• Backward-forward algorithm*

• Viterbi algorithm*

)),,(exp(),1(),(),(

1,

,,,,

ssxfypyqyrK

kkkryq

qppylryl ll

_

)),,(exp(),1(),(max),(1

,,,,

,

ssxfypyqyrK

kkkryqyl

qppryl ll

_

p(state yr ends at r |xl+1 xl+2… xr-1xr and state yl ends at l) =

Page 43: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

43

Contrastive Divergence

Page 44: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

44

Reversible jump MCMC Algorithm

• Three types of proposals Position switching: randomly select a segment j and

a new position assignment dj(i+1) ~U(dj-1

(i),dj+1(i))

Segment split: randomly select a segment j and split it into two segments where (dj

(i+1) , dj+1(i+1) ) =

G(dj-1(i) ,u(i) ) where u(i) ~ U

Segment merge: randomly select a segment j and merge segment j and j+1

• Simulated annealing reversible jump MCMC for computing y = argmax P(y|x) [Andireu et al, 2000]

Page 45: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

45

Simulated annealing reversible jump MCMC

Page 46: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

46

Protein Structural Graph for Beta-helix

Page 47: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

47

Protein Structure Determination

• Lab experiments: time and labor- consuming X-ray crystallography NMR spectroscopy Electron microscopy and many more

• Computational methods: Homology modeling: ≥ 30% sequence similarity Fold recognition: < 30% sequence similarity Ab inito methods: no template structure needed

• Active research area in multiple scientific fields

Page 48: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

48

Evaluation Measure

• Q3 (accuracy)

• Precision, Recall

• Segment Overlap quantity (SOV)

• Matthew’s Correlation coefficients

)(

)1()2;1(

)2;1()2;1(1)2,1(

iS

SLENSSMAXOV

SSDELTASSMINOV

NSSSOV

))(()()( iiiiiiii

iiiii

onunopup

ounpC

P + P-

T + P u

T - o nuonp

npQ

3

op

pQ pre

up

pQ

Page 49: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

49

Outline

• Brief introduction to protein structures

• Discriminative graphical models

• Generalized discriminative graphical models for protein fold recognition

• Experiment results

• Conclusion and discussion

Page 50: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

50

Graphical Models for Structured Prediction

• Conditional Random Fields Model conditional probability directly, not joint probability Allow arbitrary dependencies in observation (e.g. long

range, overlapping) Adaptive to different loss functions and regularizers Promising results in multiple applications

• Recent developments Alternative estimation algorithms (Collins, 2002, Dietterich et al,

2004) Alternative loss functions, use of kernels (Taskar et al., 2003,

Altun et al, 2003, Tsochantaridis et al, 2004) Baysian formulation (Qi and Minka, 2005) and semi-markov

version (Sarawagi and cohen, 2004)

Page 51: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

51

Local Information

• PSI-blast profile Position-specific scoring matrices (PSSM) Linear transformation[Kim & Park, 2003]

• SVM classifier with RBF kernel • Feature#1 (Si): Prediction score for each

residue Ri

)5(0.1

)55(1.05.0

)5(0

)(

xif

xifx

xif

xf

Page 52: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

52

Structured Data Prediction

• Data in many applications have inherent structures

John ate the cat .

SEQUENCEXS…WGIKQLQAR

Text sequence Parsing tree

Protein sequenceProtein structures

3-D imageSegmented objectsExampl

e

Input

Structures

• Fundamental importance in many areas• Potential for significant theoretical and practical advances

Page 53: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

53

Tertiary Fold Recognition

• Structural motif Identifiable arrangement of secondary structural elements Super-secondary structure, or protein fold

• Structural motif recognition Given a structural motif, predict its presence in a testing

sequence and produce the sequence to structure alignment, based on sequences only

β-α-β motif

Page 54: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

54

Training and Testing

• Training phase : learn the model parameters λ

Minimizing regularized log loss

Iterative search algorithms by seeking the direction whose empirical values agree with the expectation

• Testing phase: search the segmentation that maximizes P(y|x)

21

( , ) log ( )G

K

k k cc C k

L f Z

x y

( | )( ( , ) [ ( , )]) ( ) 0G

k c p k cc Ck

Lf E f

y xx y x y

Page 55: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

55

Future Work

+

Protein function

DSCTFTTAAAAKAGKAKAG

Protein sequence

Protein structure

Drug Design

Page 56: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

56

Experiment Setup• Cross-family validation

Focus on identification of non-homologous proteins

• Negative sets: PDB-minus set Non-homologous proteins in Protein Data Bank (less than

25%) – Positive sets

• Features Node features: regular expression template, predicted

secondary structure prediction scores, segment length, hydrophobicity and so on

Inter-node features: β-strand side-chain alignment scores, distance between nodes and so on

Features are general and easy to extend

Page 57: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

57

Major Challenges

• Protein structures are non-linear Long-range dependencies

• Structural similarity often does not indicate sequence similarity Sequence alignment reaches

twilight zone (under 25% similarity)Ubiquitin (blue) Ubx-Faf1 (gold)

β-α-β motif

Page 58: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

58

Conditional Random Fields [Lafferty et al, 2001]

• Model p (label y|observation x), not joint distribution

• Global normalization over undirected graphical models

• Allow arbitrary dependencies in observation (e.g. long range, overlapping)

• Adaptive to different loss functions and regularizers

Page 59: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

59

Quaternary Fold

• Quaternary structures Multiple sequences with tertiary

structures associated together to form a stable structure

Similar structure stabilization mechanism as tertiary structures

• Very limited research work to date Complex structures Few positive training data

Triple beta-spirals

Tumor necrosis factor (TNF)

Page 60: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

60

Protein Quaternary Fold Recognition

Seq 1: APA FSVSPA … SGACGP ECAESG

Seq 2 : DSCTFT…TAAAAKAGKAKCSTITL

Inter-chain and intra-chain dependencies

Training

Input: ..APAFSVSPASGACGPECA..

Output 1: Does the target fold exist?

Output 2:..

Yes

Testing

Page 61: Conditional Graphical Models for Protein Structure Prediction

Carnegie MellonSchool of Computer Science

61

Conditional Random Fields

• Promising results in multiple applications Tagging (Collins, 2002) and parsing (Sha and Pereira, 2003)

Information extraction (Pinto et al., 2003)

Image processing (Kumar and Hebert, 2004)

DNA sequence analysis (Bockhorst & Craven, 2005)