The incredible expanding universe of amino acids - Part 1


There are 23 currently known, natural, genetically encoded amino acids - they are pictured above, ordered by the number of heavy atoms they contain. There are the core 20, then the additional, more unusual ones: selenocysteine, N-formyl-methionine and pyrrolysine. N-formyl-methionine is used to initiate translation in bacteria (and in mitochondria and chloroplasts), while pyrrolysine is found only in some archaea and bacteria. Post-translationally, many further covalent modifications are found, for example phosphorylation of serine, threonine, tyrosine and histidine, but the above are the core building blocks of proteins; the incredible chemical diversity of the proteome can be thought of as 'edits' on this core genetic set.

All of the above are alpha amino acids, and all, with the exception of glycine, have defined stereochemistry at the alpha carbon (they are all L-amino acids). Three of the amino acids also have defined chirality in their side chains (isoleucine, threonine and pyrrolysine). There are only six elements used within this set (Carbon, Hydrogen, Nitrogen, Oxygen, Sulphur and Selenium). For me, to think that (along with some non-genetically encoded cofactors, such as ATP, Zinc, etc.) all the chemistry going on in our bodies comes from this simplicity is amazing. Of course, most of this complexity of function comes from the fact that amino acids can form polymers: from the 20 'common' genetically encoded amino acids there are 20² (400) possible dipeptides, 20³ (8,000) tripeptides, and so on, so the chemical space of peptides gets big, quickly - for a decapeptide there are over 10 trillion possible covalently distinct peptides (well more than that actually, if free thiols and the connectivity isomers of disulphide bonds are considered). This diversity of peptides has been well explored, but we became interested some time ago in identifying other useful amino acids.
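To make the peptide arithmetic above concrete, here is a minimal Python sketch. It assumes the simplest possible model: linear peptides only, each position filled independently from the 20 'common' amino acids, with no disulphide isomers or modifications counted. The function name `count_peptides` is just an illustrative choice, not something from any library.

```python
# Count linear peptides of a given length, assuming each position can be
# any one of a fixed set of residues (an illustrative simplification).
N_COMMON = 20  # the 20 'common' genetically encoded amino acids


def count_peptides(length: int, alphabet_size: int = N_COMMON) -> int:
    """Return the number of distinct linear sequences: alphabet_size ** length."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return alphabet_size ** length


for n in (2, 3, 10):
    # Thousands separators make the growth easier to read.
    print(f"{n}-mer: {count_peptides(n):,}")
```

Passing `alphabet_size=23` instead gives the count if all of the genetically encoded amino acids mentioned above are allowed at every position, which is again only a rough upper bound, since N-formyl-methionine, for instance, only occurs at the N-terminus.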

You can go a long way with this set of building blocks; however, it is sometimes desirable to include other amino acids in drugs or proteins. For example, the drug Desmopressin, a vasopressin receptor agonist, has a deaminated N-terminus and the unnatural chiral form of arginine (D-arginine) at the eighth position - these changes improve drug properties compared to dosing with the natural peptide. Occasionally, more radically different amino acids are used, e.g. Aib (alpha-aminoisobutyric acid) at position 2 of the clinical candidate Taspoglutide, a GLP-1 mimetic.