Avian Flu Bioinformatics NS-1 Motif analysis

In this exercise we will examine the NS-1 gene, using full coding sequence data sets, to determine the identity of the last four amino acids of the NS1 protein. Research has indicated that this C-terminal PL (PDZ ligand) motif can help identify the host or source of an influenza strain (Large-Scale Sequence Analysis of Avian Influenza Isolates, SCIENCE VOL 311, 17 MARCH 2006, http://www.sciencemag.org/). In that research, analysis of the NS-1 amino acid (AA) sequence showed the following motifs associated with avian, human, and swine influenzas:

EPEV, ESEV - mostly avian
GPEV, GPKV - mostly swine
RPKV, RSKV - mostly human
KSEV - Spanish influenza of 1918
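
The motif lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not a validated tool: it assumes the input is an in-frame coding sequence that ends at (or just before) its stop codon, and the function names (translate, pl_motif, classify) are invented for this example. The 21-letter test sequence is made up to spell TIESEV and is not taken from a GenBank record.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G ordering
# (third base varies fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Motif-to-host associations copied from the list above.
MOTIF_HOSTS = {
    "EPEV": "mostly avian", "ESEV": "mostly avian",
    "GPEV": "mostly swine", "GPKV": "mostly swine",
    "RPKV": "mostly human", "RSKV": "mostly human",
    "KSEV": "Spanish influenza of 1918",
}

def translate(cds):
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    cds = cds.upper().replace("U", "T")  # accept RNA or DNA letters
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # "X" for ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

def pl_motif(protein):
    """Return the last four amino acids of a protein sequence."""
    return protein[-4:]

def classify(motif):
    """Look up a four-residue motif in the table above."""
    return MOTIF_HOSTS.get(motif, "not in the list above")

# Made-up coding tail (hypothetical, for illustration only): T I E S E V stop
tail = "ACTATTGAATCTGAAGTTTGA"
motif = pl_motif(translate(tail))
print(motif, "-", classify(motif))
```

Motifs that are not in the list (for example a truncated NS1) fall through to the default label rather than being forced into a host category.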

In human flu: the PL signature motifs from highly pathogenic human infections, with a low pathogenic human strain for comparison, are listed below, together with results from binding studies of these motifs to the PDZ domain of the protein Disheveled (Dsh):

GSESEV (2003) – Avian influenza – strong binding, influenza is lethal
GSEPEV (1997) – Avian influenza – strong binding, influenza is lethal
GSKSEV (1918) – Avian influenza – very strong binding, influenza is very lethal
GSRSKV (low pathogenic) – Human influenza – moderate binding, influenza is moderate

Now let’s look at the codons for G, K, E and R:

G (Glycine)    K (Lysine)    E (Glutamic Acid)    R (Arginine)
GGU            AAA           GAA                  CGU
GGC            AAG           GAG                  CGC
GGA                                               CGA
GGG                                               CGG

E => K: GAA => AAA, GAG => AAG

The distance from E => K is one letter, in the first base (e.g. GAA => AAA)
The distance from E => G is one letter, in the second base (e.g. GAA => GGA)
The distance from R => G is one letter, in the first base (e.g. CGA => GGA)
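
These one-letter distances can be checked by comparing every codon pair between two amino acids. The sketch below is one way to do that, assuming the standard genetic code and the RNA alphabet; the names (CODONS, R_EXTRA, closest, changed_positions) are invented for this example. Arginine's two remaining codons, AGA and AGG, are kept in a separate list so that their effect on the R to K distance can be seen on its own.

```python
from itertools import product

# Standard codons for the four amino acids discussed above (RNA alphabet).
CODONS = {
    "G": ["GGU", "GGC", "GGA", "GGG"],
    "K": ["AAA", "AAG"],
    "E": ["GAA", "GAG"],
    "R": ["CGU", "CGC", "CGA", "CGG"],  # the CGN family only
}
# Arginine is also encoded by AGA and AGG; kept apart for comparison.
R_EXTRA = ["AGA", "AGG"]

def hamming(a, b):
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(a, b))

def changed_positions(a, b):
    """1-based codon positions that differ (3 is the wobble base)."""
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]

def closest(src_codons, dst_codons):
    """Return (distance, source codon, destination codon) for the closest pair."""
    return min((hamming(s, d), s, d) for s, d in product(src_codons, dst_codons))

for src, dst in [("E", "K"), ("E", "G"), ("R", "G"), ("R", "K")]:
    dist, s, d = closest(CODONS[src], CODONS[dst])
    print(f"{src} => {dst}: {dist} letter(s), e.g. {s} => {d}, "
          f"position(s) {changed_positions(s, d)}")

# Same comparison using arginine's AGA/AGG codons instead of CGN.
dist, s, d = closest(R_EXTRA, CODONS["K"])
print(f"R (AGA/AGG) => K: {dist} letter(s), e.g. {s} => {d}")
```

Which arginine codon a given strain actually uses can only be read from its nucleotide record, so the last comparison is a reminder to check the sequence rather than a statement about any particular isolate.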

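The distance from R to K (human to bird) is large for the arginine codons listed above: at least two letters, e.g. CGA => AAA. Hence the bird-to-bird mutation of one letter, E to K, was most likely the source of the 1918 influenza pandemic. Note that arginine is also encoded by AGA and AGG, each of which is one letter away from a lysine codon, so this argument holds only if the strain uses a CGN codon for R. Analysis of the influenza pandemics of 1957 (H2N2) and 1968 (H3N2) indicates that both were human influenzas. Below are the accession numbers for the sequences and PL motifs that we are following through time and host.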

H1N1 TIKSEV – A/Brevig Mission 1918 (human) AF333238
H1N1 TIRSEV – Puerto Rico 1934 (human) AF389122
H2N2 TIRSEV – Leningrad 1957 (human) M81578
H1N1 TIRSEV – Leningrad 1954 (human) X52146
H3N2 TIRSKV – Aichi 1968 (human) M35094
H3N2 TIRSKV – England 1969 (human) AJ298949
H3N2 TIRSKV – Hong Kong 1973 (human) CY009008
H1N1 TIRSKV – Chile 1983 (human) X15282
H3N2 TARSKV – Memphis 1995 (human) CY002276
H1N1 TIRSEV – Taiwan 1996 (human) AF055423
H5N1 TIEPEV – Hong Kong 1997 (human) AF036360
H3N2 TARSKV – Hong Kong 1997 (human) AF256182
H3N2 TARSKV – New York 1998 (human) CY001497
H5N1 TIESKV – Mongolia 2005 (goose) AB239306
H3N2 TARSKV – New York 1999 (human) CY001780
H3N2 TARSKV – New York 2000 (human) CY000469
H3N2 TIRSEV – New York 2001 (human) CY001956
H3N2 TARSEV – New York 2002 (human) CY000589
H3N2 TARSKV – New York 2003 (human) CY000905
H1N2 TARSKV – New York 2003 (human) CY002356
H3N2 TARSKV – New York 2004 (human) CY003076
H3N2 TARSKV – New York 2005 (human) CY002012
H3N2 TIRSEV – Influenza X- A virus 2003 AB036777
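
To follow a motif through time, the list above can be turned into records and grouped by motif. The sketch below is a rough illustration: the subtype, motif, year and accession values are copied from the list, the final entry is left out because its host and strain name are unclear, and the names RECORDS and motif_spans are invented for this example.

```python
from collections import defaultdict

# (subtype, six-residue motif, year, accession), copied from the list above.
RECORDS = [
    ("H1N1", "TIKSEV", 1918, "AF333238"), ("H1N1", "TIRSEV", 1934, "AF389122"),
    ("H2N2", "TIRSEV", 1957, "M81578"),   ("H1N1", "TIRSEV", 1954, "X52146"),
    ("H3N2", "TIRSKV", 1968, "M35094"),   ("H3N2", "TIRSKV", 1969, "AJ298949"),
    ("H3N2", "TIRSKV", 1973, "CY009008"), ("H1N1", "TIRSKV", 1983, "X15282"),
    ("H3N2", "TARSKV", 1995, "CY002276"), ("H1N1", "TIRSEV", 1996, "AF055423"),
    ("H5N1", "TIEPEV", 1997, "AF036360"), ("H3N2", "TARSKV", 1997, "AF256182"),
    ("H3N2", "TARSKV", 1998, "CY001497"), ("H5N1", "TIESKV", 2005, "AB239306"),
    ("H3N2", "TARSKV", 1999, "CY001780"), ("H3N2", "TARSKV", 2000, "CY000469"),
    ("H3N2", "TIRSEV", 2001, "CY001956"), ("H3N2", "TARSEV", 2002, "CY000589"),
    ("H3N2", "TARSKV", 2003, "CY000905"), ("H1N2", "TARSKV", 2003, "CY002356"),
    ("H3N2", "TARSKV", 2004, "CY003076"), ("H3N2", "TARSKV", 2005, "CY002012"),
]

def motif_spans(records, tail=4):
    """Map each motif (last `tail` residues) to (first year, last year, subtypes)."""
    years = defaultdict(list)
    subtypes = defaultdict(set)
    for subtype, motif, year, _accession in records:
        key = motif[-tail:]
        years[key].append(year)
        subtypes[key].add(subtype)
    return {k: (min(v), max(v), sorted(subtypes[k])) for k, v in years.items()}

# Print motifs in order of first appearance in this data set.
for motif, (first, last, subs) in sorted(motif_spans(RECORDS).items(),
                                         key=lambda kv: kv[1][0]):
    print(f"{motif}: {first}-{last} in {', '.join(subs)}")
```

Calling motif_spans(RECORDS, tail=6) groups on the full six-residue motif instead, which separates TIRSKV from TARSKV. The spans only reflect the isolates listed here, so a gap in years is a gap in this sample, not necessarily in circulation.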

Lab exercises:

  1. BLAST Exercise – follow the changes in these motifs by BLASTing the protein database using the protein sequence from any of these accession numbers.
  2. Sequence exercise – go to http://www.ncbi.nlm.nih.gov/genomes/FLU/FLU.html and search for sequences by year, subtype (serotype) or segment of the genome. Research H7N7 and H9N2, looking at the PL motif in NS1.
  3. Sequence exercise – go to http://flu.lanl.gov/ and repeat the exercise above.
  4. Alignment exercise – use the alignment tools at each of the above databases.
  5. Practice using the ‘saved search and experiments’ feature, and make sure to ‘read the readme’.
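
For the sequence exercises, records can also be retrieved by script instead of through the web pages. The sketch below assumes NCBI's E-utilities efetch service and assumes that the accession numbers in this lesson are nucleotide records (db=nuccore); both are assumptions, not part of the original exercise. The function names are invented for this example, and fetch_fasta needs a network connection.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities efetch endpoint (assumed here; not named in the lesson).
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

def efetch_url(accession, db="nuccore"):
    """Build an efetch URL that requests one record in FASTA text format."""
    params = {"db": db, "id": accession, "rettype": "fasta", "retmode": "text"}
    return EFETCH + "?" + urlencode(params)

def fetch_fasta(accession, db="nuccore"):
    """Download one FASTA record as text (requires network access)."""
    with urlopen(efetch_url(accession, db)) as response:
        return response.read().decode("utf-8")

# Example (not run here): print(fetch_fasta("AF333238"))
print(efetch_url("AF333238"))
```

The downloaded FASTA text can then be pasted into BLAST or an alignment tool as in exercises 1 and 4.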

Questions:

  1. How long does the TIRSEV motif persist in any of the HxNy serotypes?
  2. When does the TIESEV / TIEPEV (avian) motif appear again in a flu data set?
  3. Does an avian motif (e.g. RSEV) appear in any recent H1N1 serotypes?
  4. Can you start to see the ‘mixing of motifs’ described in ‘recombinomics’?
  5. How stable are the avian motifs in NS1 across serotypes and over decades?

This lesson is copyrighted using an Educational Common License, and may be used freely without restriction for academic purposes.

Copyright © 2007 Robert D. Cormia - January 5, 2007