diff --git a/structure/symmetry.md b/structure/symmetry.md index 1940106..ef6a8f8 100644 --- a/structure/symmetry.md +++ b/structure/symmetry.md @@ -1,14 +1,14 @@ Protein Symmetry using BioJava ================================================================ -BioJava can be used to detect, analyze, and visualize **symmetry** and -**pseudo-symmetry** in the **quaternary** (biological assembly) and tertiary +BioJava can be used to detect, analyze, and visualize **symmetry** and +**pseudo-symmetry** in the **quaternary** (biological assembly) and tertiary (**internal**) structural levels. ## Quaternary Symmetry -The **quaternary symmetry** of a structure defines the relations between -its individual chains or groups of chains. For a more extensive explanation +The **quaternary symmetry** of a structure defines the relations between +its individual chains or groups of chains. For a more extensive explanation about symmetery visit the [PDB help page](http://www.rcsb.org/pdb/staticHelp.do?p=help/viewers/jmol_symmetry_view.html). In the **quaternary symmetry** detection problem, we are given a set of chains @@ -19,7 +19,7 @@ relates them. The solution is divided into the following steps: in the pseudo-symmetry case). For that, we perform a pairwise alignment of all chains and determine **clusters of identical chains**. 2. Next, we reduce the each chains to a single point, its **centroid** (center of mass). -3. After that, we try different **symmetry relations** to superimpose the chain centroids +3. After that, we try different **symmetry relations** to superimpose the chain centroids and obtain their RMSD. 4. At last, based on the parameters (cutoffs), we determine the **overall symmetry** of the structure, with the symmetry relations obtained in the previous step. 
@@ -36,16 +36,20 @@ Structure s; //Set some parameters if needed different than DEFAULT - see descriptions QuatSymmetryParameters parameters = new QuatSymmetryParameters(); -parameters.setVerbose(true); //print information +SubunitClustererParameters clusterParams = new SubunitClustererParameters(); //Instantiate the detector -QuatSymmetryDetector detector = QuatSymmetryDetector(structure, parameters); +QuatSymmetryDetector detector = QuatSymmetryDetector(s, parameters, clusterParams); -//The getters calculate the quaternary symmetry automatically -List<QuatSymmetryResults> globalResults = detector.getGlobalSymmetry(); -List<List<QuatSymmetryResults>> localResults = detector.getLocalSymmetries(); +//Static methods in QuatSymmetryDetector perform the calculation +QuatSymmetryResults globalResults = QuatSymmetryDetector.getGlobalSymmetry(s, parameters, clusterParams); +List<QuatSymmetryResults> localResults = QuatSymmetryDetector.getLocalSymmetries(s, parameters, clusterParams); ``` +See also the demo in the BioJava repo: + +https://github.com/biojava/biojava/blob/885600670be75b7f6bc5216bff52a93f43fff09e/biojava-structure/src/main/java/demo/DemoSymmetry.java#L37-L59 + The return type are `List` because there can be multiple valid options for the quaternary symmetry. The local results `List` is empty if there exist no local symmetry in the structure, and the global results `List` has always size bigger @@ -83,35 +87,35 @@ only applies to a subset of chains. ### Pseudo-Symmetry In **pseudo-symmetry** the chains related by the symmetry are not completely -identical, but they share a sequence similarity above the pseudo-symmetry +identical, but they share a sequence similarity above the pseudo-symmetry similarity threshold. -If we consider hemoglobin, at a 95% sequence identity threshold the alpha and -beta subunits are considered different, which correspond to an A2B2 stoichiometry -and a C2 point group. 
At the structural similarity level, all four chains are -considered homologous (~45% sequence identity) with an A4 pseudostoichiometry and -D2 pseudosymmetry. +If we consider hemoglobin, at a 95% sequence identity threshold the alpha and +beta subunits are considered different, which correspond to an A2B2 stoichiometry +and a C2 point group. At the structural similarity level, all four chains are +considered homologous (~45% sequence identity) with an A4 pseudostoichiometry and +D2 pseudosymmetry. ![PDB ID 4HHB](img/symm_pseudo.png) ## Internal Symmetry -**Internal symmetry** refers to the symmetry present in a single chain, that is, -the tertiary structure. The algorithm implemented in biojava to detect internal +**Internal symmetry** refers to the symmetry present in a single chain, that is, +the tertiary structure. The algorithm implemented in biojava to detect internal symmetry is called **CE-Symm**. ### CE-Symm -The **CE-Symm** algorithm was originally developed by [Myers-Turnbull D., Bliven SE., +The **CE-Symm** algorithm was originally developed by [Myers-Turnbull D., Bliven SE., Rose PW., Aziz ZK., Youkharibache P., Bourne PE. & Prlić A. in 2014] (http://www.sciencedirect.com/science/article/pii/S0022283614001557) [![pubmed](http://img.shields.io/badge/in-pubmed-blue.svg?style=flat)](http://www.ncbi.nlm.nih.gov/pubmed/24681267). As the name of the algorithm explicitly states, **CE-Symm** uses the Combinatorial -Extension (**CE**) algorithm to generate an alignment of the structure chain to itself, -disabling the identity alignment (the diagonal of the **DotPlot** representation of a -structure alignment). This allows the identification of alternative self-alignments, +Extension (**CE**) algorithm to generate an alignment of the structure chain to itself, +disabling the identity alignment (the diagonal of the **DotPlot** representation of a +structure alignment). 
This allows the identification of alternative self-alignments, which are related to symmetry and/or structural repeats inside the chain. -By a procedure called **refinement**, the subunits of the chain that are part of the symmetry +By a procedure called **refinement**, the subunits of the chain that are part of the symmetry are defined and a **multiple alignment** is created. This process can be thought as to divide the chain into other subchains, and then superimposing each subchain to each other to create a multiple alignment of the subunits, respecting the symmetry axes. @@ -156,7 +160,7 @@ System.out.println(pg.getSymmetry()); To enable some extra features in the display, a `SymmetryDisplay` class has been created, although the `MultipleAlignmentDisplay` method -can also be used for that purpose (it will not show symmetry axes or +can also be used for that purpose (it will not show symmetry axes or symmetry menus). Lastly, the `SymmetryGUI` class in the **structure-gui** package @@ -166,7 +170,7 @@ to the GUI to trigger structure alignments. ### Symmetry Display The symmetry display is similar to the **quaternary symmetry**, because -part of the code is shared. See for example this beta-propeller (1U6D), +part of the code is shared. See for example this beta-propeller (1U6D), where the repeated beta-sheets are connected by a linker forming a C6 point group internal symmetry: @@ -175,10 +179,10 @@ point group internal symmetry: #### Hierarchical Symmetry One additional feature of the **internal symmetry** display is the representation -of hierarchical symmetries and repeats. Contrary to point groups, some structures -have different **levels** of symmetry. That is, the whole strucutre has, e.g. C2 -symmetry and, at the same time, each of the two parts has C2 symmetry, but the axes -of both levels are not related by a point group (i.e. they do not cross to a single +of hierarchical symmetries and repeats. 
Contrary to point groups, some structures +have different **levels** of symmetry. That is, the whole strucutre has, e.g. C2 +symmetry and, at the same time, each of the two parts has C2 symmetry, but the axes +of both levels are not related by a point group (i.e. they do not cross to a single point). A very clear example are the beta-gamma-crystallins, like 4GCR: @@ -187,14 +191,14 @@ A very clear example are the beta-gamma-crystallins, like 4GCR: #### Subunit Multiple Alignment -Another feature of the display is the option to show the **multiple alignment** of +Another feature of the display is the option to show the **multiple alignment** of the symmetry related subunits created during the **refinement** process. Search for -the option *Subunit Superposition* in the *symmetry* menu of the Jmol window. For +the option *Subunit Superposition* in the *symmetry* menu of the Jmol window. For the previous example the display looks like that: ![PDB ID 4GCR](img/symm_subunits.png) -The subunit display highlights the differences and similarities between the symmetry +The subunit display highlights the differences and similarities between the symmetry related subunits of the chain, and helps the user to identify conseved and divergent regions, with the help of the *Sequence Alignment Panel*. @@ -202,9 +206,9 @@ regions, with the help of the *Sequence Alignment Panel*. Finally, the internal and quaternary symmetries can be combined to obtain the global overall combined symmetry. As we have seen before, the protein 1VYM is a DNA-clamp that -has three chains relates by C3 symmetry. Each chain is internally C2 symmetric, and each -part of the C2 internal symmetry is C2 symmetric, so a case of **hierarchical symmetry** -(C2 + C2). Once we have divided the whole structure into its asymmetric parts, we can +has three chains relates by C3 symmetry. 
Each chain is internally C2 symmetric, and each +part of the C2 internal symmetry is C2 symmetric, so a case of **hierarchical symmetry** +(C2 + C2). Once we have divided the whole structure into its asymmetric parts, we can analyze the global symmetry that related each one of them. The interesting result is that in some cases, the internal symmetry **multiplies** the point group of the quaternary symmetry. What seemed a C3 + C2 + C2 is combined into a D6 overall symmetry, as we can see in the figure