74 changes: 39 additions & 35 deletions structure/symmetry.md
@@ -1,14 +1,14 @@
Protein Symmetry using BioJava
================================================================

BioJava can be used to detect, analyze, and visualize **symmetry** and
**pseudo-symmetry** in the **quaternary** (biological assembly) and tertiary
(**internal**) structural levels.

## Quaternary Symmetry

The **quaternary symmetry** of a structure defines the relations between
its individual chains or groups of chains. For a more extensive explanation
of symmetry, visit the [PDB help page](http://www.rcsb.org/pdb/staticHelp.do?p=help/viewers/jmol_symmetry_view.html).

In the **quaternary symmetry** detection problem, we are given a set of chains
@@ -19,7 +19,7 @@ relates them. The solution is divided into the following steps:
in the pseudo-symmetry case). For that, we perform a pairwise alignment of all
chains and determine **clusters of identical chains**.
2. Next, we reduce each chain to a single point, its **centroid** (center of mass).
3. After that, we try different **symmetry relations** to superimpose the chain centroids
and obtain their RMSD.
4. Finally, based on the parameters (cutoffs), we determine the **overall symmetry** of the
structure from the symmetry relations obtained in the previous step.
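
As a rough illustration of steps 2 and 3, the sketch below reduces chains to centroids and scores a trial Cn rotation by RMSD. It is a toy under stated assumptions, not BioJava's implementation: the class and method names are hypothetical, the centroid is an unweighted mean, the rotation axis is fixed to the z axis through the origin, and each rotated centroid is simply matched to its nearest neighbour.

```java
/** Toy sketch of steps 2 and 3: chain centroids, then RMSD under a trial Cn rotation. */
final class CentroidSymmetrySketch {

    /** Unweighted mean of a chain's atom coordinates (stand-in for the center of mass). */
    static double[] centroid(double[][] atoms) {
        double[] c = new double[3];
        for (double[] atom : atoms) {
            for (int i = 0; i < 3; i++) {
                c[i] += atom[i] / atoms.length;
            }
        }
        return c;
    }

    /** Rotates a point about the z axis by 360/n degrees (one trial Cn operation). */
    static double[] rotateZ(double[] p, int n) {
        double theta = 2 * Math.PI / n;
        double cos = Math.cos(theta);
        double sin = Math.sin(theta);
        return new double[] {p[0] * cos - p[1] * sin, p[0] * sin + p[1] * cos, p[2]};
    }

    /** RMSD between each rotated centroid and its nearest unrotated centroid. */
    static double rmsdUnderCn(double[][] centroids, int n) {
        double sumSq = 0;
        for (double[] c : centroids) {
            double[] r = rotateZ(c, n);
            double nearestSq = Double.MAX_VALUE;
            for (double[] other : centroids) {
                double dSq = 0;
                for (int i = 0; i < 3; i++) {
                    dSq += (r[i] - other[i]) * (r[i] - other[i]);
                }
                nearestSq = Math.min(nearestSq, dSq);
            }
            sumSq += nearestSq;
        }
        return Math.sqrt(sumSq / centroids.length);
    }

    /** Hypothetical test data: n centroids evenly spaced on a circle around the z axis. */
    static double[][] ring(int n, double radius) {
        double[][] points = new double[n][];
        for (int k = 0; k < n; k++) {
            double angle = 2 * Math.PI * k / n;
            points[k] = new double[] {radius * Math.cos(angle), radius * Math.sin(angle), 0};
        }
        return points;
    }

    public static void main(String[] args) {
        double[][] trimer = ring(3, 10.0);
        System.out.printf("C3 RMSD: %.3f%n", rmsdUnderCn(trimer, 3));
        System.out.printf("C2 RMSD: %.3f%n", rmsdUnderCn(trimer, 2));
    }
}
```

A low RMSD for a trial operation suggests that the centroids are compatible with that symmetry; step 4 would then compare such scores against the configured cutoffs.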
@@ -36,16 +36,20 @@ Structure s;

// Set parameters here if values other than the defaults are needed - see descriptions
QuatSymmetryParameters parameters = new QuatSymmetryParameters();
SubunitClustererParameters clusterParams = new SubunitClustererParameters();

// Static methods in QuatSymmetryDetector perform the calculation
QuatSymmetryResults globalResults = QuatSymmetryDetector.getGlobalSymmetry(s, parameters, clusterParams);
List<QuatSymmetryResults> localResults = QuatSymmetryDetector.getLocalSymmetries(s, parameters, clusterParams);

```
See also the demo in the BioJava repo:

https://github.com/biojava/biojava/blob/885600670be75b7f6bc5216bff52a93f43fff09e/biojava-structure/src/main/java/demo/DemoSymmetry.java#L37-L59

The return type is a `List` because there can be multiple valid options for the
quaternary symmetry. The local results `List` is empty if no local
symmetry exists in the structure, and the global results `List` always has a size bigger
@@ -83,35 +87,35 @@ only applies to a subset of chains.
### Pseudo-Symmetry

In **pseudo-symmetry** the chains related by the symmetry are not completely
identical, but they share a sequence similarity above the pseudo-symmetry
similarity threshold.

If we consider hemoglobin, at a 95% sequence identity threshold the alpha and
beta subunits are considered different, which corresponds to an A2B2 stoichiometry
and a C2 point group. At the structural similarity level, all four chains are
considered homologous (~45% sequence identity), with an A4 pseudostoichiometry and
D2 pseudosymmetry.
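
The effect of the threshold on the stoichiometry can be sketched with a small, self-contained example. This is only an illustration under assumptions: the class name, the single-linkage clustering, the label format, and the identity matrix (1.0 between like chains, 0.45 between alpha and beta, following the approximate figure above) are all hypothetical and are not taken from BioJava's clustering code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy sketch: derive a stoichiometry string from pairwise identities and a threshold. */
final class StoichiometrySketch {

    /** Single-linkage clustering: chains i and j share a cluster if identity[i][j] >= threshold. */
    static String stoichiometry(double[][] identity, double threshold) {
        int n = identity.length;
        int[] cluster = new int[n];
        for (int i = 0; i < n; i++) {
            cluster[i] = i; // every chain starts in its own cluster
        }
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                if (identity[i][j] >= threshold && cluster[i] != cluster[j]) {
                    int from = cluster[j];
                    int to = cluster[i];
                    for (int k = 0; k < n; k++) {
                        if (cluster[k] == from) {
                            cluster[k] = to; // merge the two clusters
                        }
                    }
                }
            }
        }
        int[] size = new int[n];
        for (int c : cluster) {
            size[c]++;
        }
        List<Integer> sizes = new ArrayList<>();
        for (int s : size) {
            if (s > 0) {
                sizes.add(s);
            }
        }
        sizes.sort(Collections.reverseOrder()); // largest cluster gets the letter A
        StringBuilder sb = new StringBuilder();
        char letter = 'A';
        for (int s : sizes) {
            sb.append(letter++);
            if (s > 1) {
                sb.append(s);
            }
        }
        return sb.toString();
    }

    /** Hypothetical identities for four chains ordered alpha, beta, alpha, beta. */
    static double[][] hemoglobinLikeIdentities() {
        return new double[][] {
            {1.00, 0.45, 1.00, 0.45},
            {0.45, 1.00, 0.45, 1.00},
            {1.00, 0.45, 1.00, 0.45},
            {0.45, 1.00, 0.45, 1.00}
        };
    }

    public static void main(String[] args) {
        double[][] identity = hemoglobinLikeIdentities();
        System.out.println(stoichiometry(identity, 0.95));
        System.out.println(stoichiometry(identity, 0.40));
    }
}
```

With the strict threshold the alpha and beta chains stay in separate clusters, while the relaxed threshold merges all four chains into one cluster, which mirrors the A2B2 versus A4 distinction described above.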

![PDB ID 4HHB](img/symm_pseudo.png)

## Internal Symmetry

**Internal symmetry** refers to the symmetry present in a single chain, that is,
the tertiary structure. The algorithm implemented in BioJava to detect internal
symmetry is called **CE-Symm**.

### CE-Symm

The **CE-Symm** algorithm was originally developed by [Myers-Turnbull D., Bliven SE.,
Rose PW., Aziz ZK., Youkharibache P., Bourne PE. & Prlić A. in 2014](http://www.sciencedirect.com/science/article/pii/S0022283614001557) [![pubmed](http://img.shields.io/badge/in-pubmed-blue.svg?style=flat)](http://www.ncbi.nlm.nih.gov/pubmed/24681267).
As the name of the algorithm explicitly states, **CE-Symm** uses the Combinatorial
Extension (**CE**) algorithm to generate an alignment of the structure chain to itself,
disabling the identity alignment (the diagonal of the **DotPlot** representation of a
structure alignment). This allows the identification of alternative self-alignments,
which are related to symmetry and/or structural repeats inside the chain.
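
The idea of a self-alignment with the identity diagonal disabled can be illustrated with a toy, sequence-based analogy. The sketch below is not CE and not CE-Symm, which operate on 3D structures; the class and method names are hypothetical. It scans every off-diagonal of a string-against-itself dot plot and reports the offset with the most matches.

```java
/** Toy analogy of a self-alignment with the main diagonal excluded. */
final class SelfAlignmentSketch {

    /**
     * Returns the non-zero offset whose dot-plot diagonal has the most matching positions,
     * or -1 if the input is too short to have any off-diagonal.
     */
    static int bestOffDiagonalOffset(String s) {
        int best = -1;
        int bestScore = -1;
        // Offset 0 is the identity alignment (main diagonal), so the scan starts at 1.
        for (int offset = 1; offset < s.length(); offset++) {
            int score = 0;
            for (int i = 0; i + offset < s.length(); i++) {
                if (s.charAt(i) == s.charAt(i + offset)) {
                    score++;
                }
            }
            if (score > bestScore) {
                bestScore = score;
                best = offset;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical element string made of two copies of the same six-letter unit.
        System.out.println(bestOffDiagonalOffset("ABCDEFABCDEF"));
    }
}
```

An offset with a high score plays the role of an alternative self-alignment here: it indicates that one part of the chain resembles another part, which is the signal that the refinement step then turns into repeats.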

By a procedure called **refinement**, the subunits of the chain that are part of the symmetry
are defined and a **multiple alignment** is created. This process can be thought of as
dividing the chain into subchains and then superimposing the subchains on each other to
create a multiple alignment of the subunits, respecting the symmetry axes.
@@ -156,7 +160,7 @@ System.out.println(pg.getSymmetry());

To enable some extra features in the display, a `SymmetryDisplay`
class has been created, although the `MultipleAlignmentDisplay` method
can also be used for that purpose (it will not show symmetry axes or
symmetry menus).

Lastly, the `SymmetryGUI` class in the **structure-gui** package
@@ -166,7 +170,7 @@ to the GUI to trigger structure alignments.
### Symmetry Display

The symmetry display is similar to the **quaternary symmetry** display, because
part of the code is shared. See for example this beta-propeller (1U6D),
where the repeated beta-sheets are connected by a linker forming a C6
point group internal symmetry:

@@ -175,10 +179,10 @@ point group internal symmetry:
#### Hierarchical Symmetry

One additional feature of the **internal symmetry** display is the representation
of hierarchical symmetries and repeats. Contrary to point groups, some structures
have different **levels** of symmetry. That is, the whole structure has, e.g., C2
symmetry and, at the same time, each of the two parts has C2 symmetry, but the axes
of both levels are not related by a point group (i.e. they do not cross at a single
point).

A very clear example is the beta-gamma-crystallins, such as 4GCR:
@@ -187,24 +191,24 @@ A very clear example are the beta-gamma-crystallins, like 4GCR:

#### Subunit Multiple Alignment

Another feature of the display is the option to show the **multiple alignment** of
the symmetry-related subunits created during the **refinement** process. Search for
the option *Subunit Superposition* in the *symmetry* menu of the Jmol window. For
the previous example the display looks like this:

![PDB ID 4GCR](img/symm_subunits.png)

The subunit display highlights the differences and similarities between the symmetry-related
subunits of the chain, and helps the user to identify conserved and divergent
regions, with the help of the *Sequence Alignment Panel*.

## Combined Global Symmetry

Finally, the internal and quaternary symmetries can be combined to obtain the
overall combined symmetry. As we have seen before, the protein 1VYM is a DNA-clamp that
has three chains related by C3 symmetry. Each chain is internally C2 symmetric, and each
part of the C2 internal symmetry is itself C2 symmetric, which is a case of **hierarchical symmetry**
(C2 + C2). Once we have divided the whole structure into its asymmetric parts, we can
analyze the global symmetry that relates them. The interesting result is that
in some cases, the internal symmetry **multiplies** the point group of the quaternary symmetry.
What seemed to be C3 + C2 + C2 is combined into a D6 overall symmetry, as we can see in the figure