Swiss-PO

Frequently Asked Questions

What is the general idea behind Swiss-PO?
Where does the data come from?
How to use the different panels of Swiss-PO?
How to use the “Quick search” button?
Which nomenclature is used for the mutations in the variant table?
How was the data curated?
How was the 3D structure database built?
Why are some PDB files related to a given protein not present in Swiss-PO?
Do structures correspond to entities of biological interest?
What does the order of appearance of the PDB files in the table of available structures correspond to?
How to choose the right structure for my analysis?
How does the visualization panel work?
How to display interactions between the residue of interest and its environment?
How to display protein complexes, when existing?
When double-clicking on a residue of the additional “green” chain, nothing happens. Why?
How were the sequence alignments of the proteins prepared?
How can I get more information about the Swiss-PO web tool?
How can I cite the Swiss-PO web tool?

Answers

What is the general idea behind Swiss-PO?

Swiss-PO is a web tool developed to help predict the potential impact of uncharacterized mutations detected in cancer cells.

Where does the data come from?

Several databases were used to build Swiss-PO:

The Protein Data Bank (https://www.pdbe.org): the first open-access digital data resource for the 3D structures of macromolecules.
UniProt/Swiss-Prot (https://www.uniprot.org): a high-quality and freely accessible resource of protein sequence and functional information.
CKB CORE (https://ckb.jax.org): a public digital resource for interpreting complex cancer genomic profiles in the context of protein structure and activity, therapies, and clinical trials.

How to use the different panels of Swiss-PO?

Panels are accompanied by a help button, in the form of a question mark icon on the right side of the panel headers, which opens a pop-up description of the main commands.

How to use the “Quick search” button?

The “Quick search” box lets you select the protein of interest directly and centers the variant panel on the mutated residue, or on the closest residues if no mutation has previously been reported for that residue in the CKB or UniProt/Swiss-Prot databases. The box takes a mutation as input, written according to the Human Genome Variation Society recommendations for the description of protein sequence variants[4] (https://varnomen.hgvs.org).
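
As an illustration of this input format, the Python sketch below parses a simple HGVS protein substitution such as p.Val600Glu and finds the closest reported positions when the exact residue has no reported mutation. It is not the Swiss-PO implementation: the function names and the restriction to three-letter substitutions are assumptions made for illustration, and the exact input forms accepted by the “Quick search” box are not specified here.

```python
import re

# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Assumption: only simple substitutions in three-letter form are handled.
SUBSTITUTION = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def parse_substitution(variant):
    """Return (reference, position, alternate) using one-letter codes."""
    match = SUBSTITUTION.match(variant.strip())
    if match is None:
        raise ValueError(f"not a simple HGVS protein substitution: {variant!r}")
    ref, pos, alt = match.groups()
    if ref not in THREE_TO_ONE or alt not in THREE_TO_ONE:
        raise ValueError(f"unknown amino acid code in {variant!r}")
    return THREE_TO_ONE[ref], int(pos), THREE_TO_ONE[alt]


def closest_positions(position, reported):
    """Return the reported position(s) nearest to `position` (ties are all kept)."""
    if not reported:
        return []
    best = min(abs(p - position) for p in reported)
    return sorted(p for p in set(reported) if abs(p - position) == best)
```

For example, `parse_substitution("p.Val600Glu")` returns `("V", 600, "E")`, and the resulting position can be passed to `closest_positions` together with the list of positions that have reported variants.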

Which nomenclature is used for the mutations in the variant table?

The nomenclature used for the mutation descriptions of the variant table follows the recommendations of the Human Genome Variation Society (https://varnomen.hgvs.org).

How was the data curated?

All data retrieved from the different databases were manually checked and curated to ensure high-quality information for users. For more information regarding the 3D structure and sequence databases, please refer to the Swiss-PO publication.

How was the 3D structure database built?

Based on our 50-panel list of genes, 3D structures were fetched from the PDB using the unique human UniProt code of each protein, to ensure that only PDB files of interest were retrieved and to avoid being misled by synonymous names of different proteins.
Of note, residues in the PDB files were renumbered when necessary, so that the residue numbers in the structure and in the sequence alignment of a given protein both correspond to the UniProt numbering, which is used as the reference. Structures were manually annotated to identify mutated and missing residues that are not mentioned in the original PDB annotations.
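
The renumbering idea can be sketched in Python as follows. This is only an illustration under stated assumptions, not the procedure actually used for Swiss-PO: it assumes that a gapped pairwise alignment between the UniProt reference sequence and the structure chain is already available as two strings of equal length, and the function and parameter names are hypothetical.

```python
def uniprot_numbering(aligned_reference, aligned_chain, first_reference_number=1):
    """Map each residue of a structure chain to its UniProt residue number.

    Both arguments are gapped sequences of equal length ('-' marks a gap).
    Returns one entry per chain residue; None marks a residue that faces a
    gap in the reference (an insertion with no UniProt number).
    """
    if len(aligned_reference) != len(aligned_chain):
        raise ValueError("aligned sequences must have the same length")
    numbers = []
    ref_number = first_reference_number - 1
    for ref_char, chain_char in zip(aligned_reference, aligned_chain):
        if ref_char != "-":
            ref_number += 1  # advance along the reference sequence
        if chain_char != "-":
            numbers.append(ref_number if ref_char != "-" else None)
    return numbers
```

Reference positions that face a gap in the chain are skipped in the output; in this sketch they correspond to residues missing from the structure.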

Why are some PDB files related to a given protein not present in Swiss-PO?

To keep only relevant structures, we selected protein chains that have fewer than 30% missing residues compared to the corresponding region of their reference sequence.
Structures with more than 10 mutations were also removed, as were those judged irrelevant, redundant or wrongly numbered because of an alignment failure. Note that we made an exception for the gene GNA11, for which a structure with more than 10 mutations was selected because it was the only one available.
It is also possible that the structure was not yet available when the Swiss-PO 3D database was built.
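
The two numerical criteria above can be written as a small Python filter. This is a minimal sketch, not the curation pipeline itself: the `ChainSummary` record and its field names are hypothetical, and the manual exclusions (irrelevant, redundant or misnumbered structures) and the GNA11 exception are not modelled.

```python
from dataclasses import dataclass

MAX_MISSING_FRACTION = 0.30  # chains must have fewer than 30% missing residues
MAX_MUTATIONS = 10           # structures with more than 10 mutations are removed


@dataclass
class ChainSummary:
    """Hypothetical per-chain summary used only for this illustration."""
    label: str
    region_length: int     # length of the corresponding reference region
    missing_residues: int  # residues of that region absent from the chain
    mutations: int         # residues differing from the reference sequence


def passes_filters(chain):
    """Return True if the chain satisfies both numerical criteria."""
    if chain.region_length <= 0:
        return False
    missing_fraction = chain.missing_residues / chain.region_length
    return missing_fraction < MAX_MISSING_FRACTION and chain.mutations <= MAX_MUTATIONS
```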

Do structures correspond to entities of biological interest?

For each complex, when possible, the biological assembly was retrieved in the form that has been demonstrated, or is believed, to be the functional form of the molecule or complex.

What does the order of appearance of the PDB files in the table of available structures correspond to?

Based on the visualization of the structures and the information available for them, we ranked the structures in the structure table according to their relevance for the interpretation of mutations. Structures are ranked in this order:

1. Most representative structure available for the gene (e.g., high resolution, wild-type, presence of an FDA-approved ligand, important domain of the protein, etc.). For each gene, the top-ranked structure will appear by default when opening the “PDB file” panel.
2. Wild-type structures in the presence of an FDA-approved ligand
3. Structures containing an FDA-approved drug in the presence of a mutation
4. Wild-type and apo structures
5. Wild-type structures in the presence of non-FDA-approved ligands
6. Mutated structures in the presence of non-FDA-approved ligands
7. Mutated apo structures
8. Structures redundant with those listed in points 1-2
9. Structures redundant with those listed in points 4-6
10. Structures redundant with those listed in point 7
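
The part of this ordering that depends only on the mutation and ligand status (points 2 to 7) can be expressed as a sort key, as in the Python sketch below. The record fields and function names are hypothetical. Choosing the most representative structure (point 1) and flagging redundant structures (points 8 to 10) rely on curator judgement, so the sketch simply accepts an optional, manually assigned `manual_tier` for those cases.

```python
def automatic_tier(wild_type, ligand):
    """Return the rank (2 to 7) implied by the mutation and ligand status.

    ligand is "fda" (FDA-approved), "other" (non-FDA-approved) or None (apo).
    """
    if ligand not in ("fda", "other", None):
        raise ValueError(f"unknown ligand category: {ligand!r}")
    if ligand == "fda":
        return 2 if wild_type else 3
    if wild_type:
        return 4 if ligand is None else 5
    return 6 if ligand == "other" else 7


def rank_structures(structures):
    """Sort structure records; a curator-assigned 'manual_tier' takes precedence."""
    def key(record):
        manual = record.get("manual_tier")
        if manual is not None:
            return manual
        return automatic_tier(record["wild_type"], record["ligand"])
    return sorted(structures, key=key)
```

Because Python's sort is stable, structures that share a rank keep their input order, so any secondary ordering (for example by resolution) can be applied beforehand.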

How to choose the right structure for my analysis?

Once you have selected the gene that interests you, open the window that contains the table of structural information (‘Choose a 3D structure’ button). The table gathers important information about the available structures, and you can follow this plan:

Find structures in which the residue or region of interest is present. To see more structures, slide the filter button from ‘high’ to ‘low’.
You can select structures based on their resolution. A structure with a resolution value above 2.7 Å is classified as low resolution, one between 2.7 and 1.8 Å as medium resolution, and one below 1.8 Å as high resolution.
The presence of a ligand, FDA-approved or not, helps to see whether the region or residue of interest is in the active site of the protein.
The presence of a macromolecule complexed to the protein helps to see whether the region or residue of interest is in a protein interface.
A mutation near the region or residue of interest can affect your analysis and should be taken into account, especially if it is known to alter the structure and activity of the protein (see the Variant table).

We recommend visualizing several structures per analysis, as the crystallographic conditions, the presence of a ligand, partner or mutation, and other factors can affect the conformation of the protein.
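
The resolution classes described in the plan above can be summarized in a few lines of Python. This is a minimal sketch with a hypothetical function name. The text does not say how values of exactly 2.7 Å or 1.8 Å are classified, so treating them as medium resolution is an assumption.

```python
def resolution_class(resolution_angstrom):
    """Classify a structure as 'high', 'medium' or 'low' resolution."""
    if resolution_angstrom <= 0:
        raise ValueError("resolution must be a positive value in angstroms")
    if resolution_angstrom > 2.7:
        return "low"
    if resolution_angstrom < 1.8:
        return "high"
    # Assumption: the boundary values 2.7 and 1.8 fall in the medium class.
    return "medium"
```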

How does the visualization panel work?

A double click on a residue or a ligand centers the view on it, displays it in ball-and-stick and displays its close neighbors in stick. In parallel, if the double click is made on a residue of the main chain, the table of variants and the sequence alignment of the orthologs will focus on the same residue. A single click on a residue centers the view on it without changing the selection of residues displayed in ball-and-stick or licorice.

How to display interactions between the residue of interest and its environment?

A double click on a residue of interest in the 3D panel, variant panel or sequence alignment displays several types of molecular interactions between the selected residue or ligand and its environment. The buttons below the 3D viewer switch the display of these interactions on or off according to their nature: hydrogen bonds, ionic interactions, cation-π interactions, hydrophobic contacts and π-stacking interactions. The interactions are displayed as dotted lines connecting the atoms involved. Each interaction type is shown in the 3D display in the same color as the button that toggles it.

How to display protein complexes, when existing?

By default, only the main chain, corresponding to the protein of interest, is displayed in the 3D structure panel. When several protein chains are available in the 3D structure, a “+” icon is activated in the top left corner of the panel. This icon displays all chains present in the 3D structure file. Chains other than the main one are displayed as ribbons and colored in green. The residues of the additional chains are shown automatically, together with their interactions, if they are in the vicinity of a residue of the protein of interest on which the user double-clicked (or which was selected from the sequence alignment or variant panels).

When double-clicking on a residue of the additional “green” chain, nothing happens. Why?

Unlike for residues of the main chain (i.e., the selected protein of interest), double-clicking on a residue of the additional “green” chain does not change the focus of the multiple alignment and variant panels, which are dedicated only to the main chain. Indeed, information regarding the mutations or sequence of the additional chains is not necessarily present in the Swiss-PO database.

How were the sequence alignments of the proteins prepared?

For each protein of our 50-panel list, the human amino acid sequence was retrieved from the UniProt database.
We carefully selected orthologous sequences for each protein from 9 other organisms. Some of these organisms were selected for their proximity to human, such as chimpanzee and macaque, while others were taken from different taxonomic groups of vertebrates (mouse, rat, dog, bovine, chicken, zebrafish and frog). This selection of species was inspired by the National Center for Biotechnology Information reference sequence organisms that are used for comparative analysis of sequences.
To obtain readable, high-quality sequence alignments, we removed sequences that generate many gaps in the alignment or that do not cover important domains. All sequence alignments were performed with the MUltiple Sequence Comparison by Log-Expectation tool (MUSCLE).
The most relevant sequence was retained for each species and, in certain cases, unreviewed sequences were preferred to reviewed ones because their resulting alignments were more relevant.

How can I get more information about the Swiss-PO web tool?

For more information, you can read the corresponding paper and its supplementary documents.

How can I cite the Swiss-PO web tool?

If you use Swiss-PO or publish analyses performed with it, please cite the related Swiss-PO paper:
Fanny S Krebs, Vincent Zoete, Maxence Trottet, Timothée Pouchon, Christophe Bovigny, Olivier Michielin (2020), in preparation