Protein secondary structure prediction with SPARROW

SPARROW server help

 

This page provides help with job submissions to the SPARROW server. It explains the supported input formats and the server's output.

 

FASTA files:

Amino acids should be represented by their one-letter codes (three-letter codes may be supported in the future). If multiple sequences are submitted, each should be introduced by a line starting with either '#' or '>'. Lines starting with these characters are used exclusively as separators and are otherwise ignored. You may split a sequence across multiple lines. No secondary structure prediction is performed for non-alphabetic entries in the sequence. Note also that selenocysteine, represented by the one-letter code 'U' (or 'u'), is translated into an 'x' by PSI-BLAST.
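The input rules above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the server's own parser; the function names `parse_sequences` and `predictable_positions` are hypothetical.

```python
def parse_sequences(text):
    """Split FASTA-like input into a list of sequences.

    Lines starting with '#' or '>' act only as separators; a sequence
    may span several lines.
    """
    sequences = []
    current = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line[0] in "#>":
            # Separator line: close the sequence collected so far, if any.
            if current:
                sequences.append("".join(current))
                current = []
            continue
        current.append(line)
    if current:
        sequences.append("".join(current))
    return sequences


def predictable_positions(sequence):
    """Indices of alphabetic entries; non-alphabetic entries get no prediction."""
    return [i for i, ch in enumerate(sequence) if ch.isalpha()]


example = ">first\nMKNDRT\nLQAIG\n# second\nMU-K\n"
for seq in parse_sequences(example):
    print(seq, predictable_positions(seq))
```

Separator text such as "first" is discarded here, following the statement that such lines are otherwise ignored.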

PSI-BLAST profiles:

If you are uploading sequence profiles generated with PSI-BLAST (using the option '-Q'), no further preparation is needed. Otherwise, make sure that the profile contains one line for each amino acid in the sequence. Each such line is expected to contain the affinities of the 20 natural amino acids, ordered alphabetically by their three-letter codes. You may also provide the amino acid one-letter code, which should be placed immediately before the affinities.

 

Example of a PSI-BLAST profile


A R N D C Q E G H I L K M F P S T W Y V
... M -1 -2 -3 -4 -2 -2 -3 -4 -3 +3 +2 -2 +5 +0 -3 -2 -1 -2 -1 +2 ...
... K -1 +2 +0 -1 -4 +1 +1 -2 -1 -3 -3 +5 -2 -4 -1 +0 -1 -3 -2 -3 ...
... N -1 -2 +4 +0 -3 -1 -1 +3 -1 -4 -4 -1 -3 -4 +5 +0 -1 -4 -3 -3 ...
... D -2 -2 +1 +7 -4 -1 +1 -2 -1 -4 -4 -1 -4 -4 -2 -1 -1 -5 -4 -4 ...
... R -2 +6 -1 -2 -4 +1 +0 -3 -1 -3 -3 +2 -2 -3 -3 -1 -1 -3 -2 -3 ...
... T +0 -1 +0 -1 -1 -1 -1 -1 -2 -1 -2 -1 -1 -3 -1 +3 +5 -3 -2 -1 ...
... L +2 -2 -3 -3 -2 -2 -2 -2 -1 +0 +2 -2 +0 +1 -2 -1 -1 +0 +4 +0 ...
... Q -1 +0 -1 -1 -3 +5 +1 -3 +0 -1 +2 +0 +0 -2 -2 -1 -1 -2 -2 -1 ...
... A +5 -2 -2 -2 -1 -1 -1 +0 -2 -2 -2 -1 -1 -3 -1 +1 +0 -3 -2 +0 ...
... I +2 -3 -3 -3 -1 -2 -3 -2 -3 +4 +1 -2 +0 -1 -2 -1 -1 -3 -2 +2 ...
... G +2 +3 -1 -2 -2 -1 -1 +3 -2 -3 -3 +0 -2 -3 -2 +0 -1 -3 -3 -2 ...
... R -2 +6 -1 -2 -4 +1 +0 -3 -1 -3 -3 +2 -2 -3 -3 -1 -1 -3 -2 -3 ...
... Q -1 +1 +0 -1 -3 +6 +2 -2 +0 -3 -3 +1 -1 -4 -2 +0 -1 -2 -2 -3 ...

....

Note: SPARROW will ignore any further information provided (represented here by dots).

 

You may submit multiple profiles, provided that each is introduced by a line starting with '#' or '>'.
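As a rough illustration of the profile format described above, the following Python sketch reads profile rows and splits multiple profiles at separator lines. It is not SPARROW's actual reader. The names `parse_profile_line` and `parse_profiles` are hypothetical, and the sketch assumes that the first run of 20 consecutive integers on a line holds the affinities and that no other integer column directly precedes them unless a residue letter sits in between.

```python
AA_ORDER = "ARNDCQEGHILKMFPSTWYV"  # column order taken from the example header


def _is_int(token):
    try:
        int(token)  # accepts signed values such as "+3" or "-4"
        return True
    except ValueError:
        return False


def parse_profile_line(line):
    """Return (residue_or_None, affinities) or None if the line holds no profile row."""
    tokens = line.split()
    for start in range(len(tokens) - 19):
        window = tokens[start:start + 20]
        if all(_is_int(t) for t in window):
            residue = None
            before = tokens[start - 1] if start > 0 else ""
            if len(before) == 1 and before.isalpha():
                residue = before  # optional one-letter code just before the affinities
            # Anything after the 20 affinities is ignored.
            return residue, [int(t) for t in window]
    return None


def parse_profiles(text):
    """Split input into profiles at lines starting with '#' or '>'."""
    profiles = [[]]
    for line in text.splitlines():
        if line.lstrip().startswith(("#", ">")):
            if profiles[-1]:
                profiles.append([])
            continue
        row = parse_profile_line(line)
        if row is not None:
            profiles[-1].append(row)
    return [p for p in profiles if p]


line = "... M -1 -2 -3 -4 -2 -2 -3 -4 -3 +3 +2 -2 +5 +0 -3 -2 -1 -2 -1 +2 ..."
residue, affinities = parse_profile_line(line)
print(residue, dict(zip(AA_ORDER, affinities))["M"])
```

Header lines such as the row of one-letter codes contain no integers and are therefore skipped.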

Output:

SPARROW and *SPARROW output a secondary structure prediction string aligned with the amino acid sequence string. In the output, 'H' means helical, 'E' means extended, and 'O' means other. When the 'EVA standard' output type is selected, the 3-turn and 5-turn helix types are assigned to the helical class together with the alpha-helix, and beta-bridges are assigned to the extended class together with beta-strands. The remaining classes together form the other class. The 'altered EVA' output type differs from 'EVA standard' only in that beta-bridges are assigned to the other class. Additionally, a 'strict' assignment exists, in which only alpha-helices are considered helical and only beta-strands extended, while everything else is considered other. The latter two assignments are currently supported only by *SPARROW.


Secondary structure types in three-state prediction

                   DSSP type
output type        'H'  'G'  'I'  'E'  'B'  'S'  'T'  ' '
EVA standard        H    H    H    E    E    O    O    O
Altered EVA (*)     H    H    H    E    O    O    O    O
Strict (*)          H    O    O    E    O    O    O    O

('H' = helical, 'E' = extended, 'O' = other)
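The three reduction schemes in the table can be written as simple lookup tables. The Python sketch below only restates the table; the names `THREE_STATE` and `reduce_dssp` are hypothetical and are not part of SPARROW.

```python
DSSP_TYPES = "HGIEBST "  # the eight DSSP types, blank included

# One row of the table per output type.
THREE_STATE = {
    "EVA standard": dict(zip(DSSP_TYPES, "HHHEEOOO")),
    "altered EVA": dict(zip(DSSP_TYPES, "HHHEOOOO")),
    "strict": dict(zip(DSSP_TYPES, "HOOEOOOO")),
}


def reduce_dssp(dssp_string, scheme="EVA standard"):
    """Map a DSSP eight-state string to three states (H, E, O)."""
    table = THREE_STATE[scheme]
    return "".join(table[c] for c in dssp_string)


for scheme in THREE_STATE:
    print(scheme, reduce_dssp("HHGGIEEB ST", scheme))
```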

In addition, *SPARROW can predict the eight DSSP types directly, thus performing an eight-state prediction. The letters used for the eight classes are identical to those used by DSSP, except that the blank is replaced by the letter 'O'.


Secondary structure types in eight-state prediction

                   DSSP type
output type        'H'  'G'  'I'  'E'  'B'  'S'  'T'  ' '
DSSP (*)            H    G    I    E    B    S    T    O

(*) = works for *SPARROW only

The secondary structure string is coloured according to the prediction confidence value, which is also reported in a third line of output. The following table shows the confidence value corresponding to each colour and the associated expected accuracy (i.e. the probability of a prediction being correct). The expected accuracies were calculated from the average specificities and accuracies for each confidence value.

 

SPARROW prediction confidence table

confidence level                  1    2    3    4    5    6    7    8    9

probability of correctness (%)
  helical                       <50   55   65   70   80   85   90   95  ~99
  extended                      <50   55   60   70   75   80   90   95  ~98
  other                          40   50   55   65   70   80   85   90  >95
  global                        <45   55   60   65   75   80   90   95  >95
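For scripted post-processing, the table can be turned into a lookup. The following Python sketch is an illustration only. It assumes, hypothetically, that the confidence line holds one digit from 1 to 9 per residue, and the names `expected_accuracy` and `annotate` are not part of the server. Values are kept as strings because some entries are bounds ("<50", ">95") or approximations ("~99").

```python
EXPECTED_ACCURACY = {
    "helical": ["<50", "55", "65", "70", "80", "85", "90", "95", "~99"],
    "extended": ["<50", "55", "60", "70", "75", "80", "90", "95", "~98"],
    "other": ["40", "50", "55", "65", "70", "80", "85", "90", ">95"],
    "global": ["<45", "55", "60", "65", "75", "80", "90", "95", ">95"],
}

STATE_NAMES = {"H": "helical", "E": "extended", "O": "other"}


def expected_accuracy(state, level):
    """Probability of correctness (%) for a class and a confidence level 1..9."""
    if not 1 <= level <= 9:
        raise ValueError("confidence level must be between 1 and 9")
    return EXPECTED_ACCURACY[state][level - 1]


def annotate(prediction, confidence):
    """Pair each predicted three-state letter with its expected accuracy.

    Assumes one confidence digit per residue (an assumption of this sketch).
    """
    return [
        (letter, expected_accuracy(STATE_NAMES[letter], int(digit)))
        for letter, digit in zip(prediction, confidence)
    ]


print(annotate("HEO", "951"))
```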