Article

Fibonacci-like Sequences Reveal the Genetic Code Symmetries, Also When the Amino Acids Are in a Physiological Environment

Physics Department, Faculty of Exact and Applied Science, University Oran 1 Ahmed Ben Bella, Oran 31100, Algeria
Symmetry 2024, 16(3), 293; https://doi.org/10.3390/sym16030293
Submission received: 5 January 2024 / Revised: 30 January 2024 / Accepted: 29 February 2024 / Published: 2 March 2024

Abstract

In this study, we once again use a set of Fibonacci-like sequences to examine the symmetries within the genetic code. This time, our focus is on the physiological state of the amino acids, considering them as charged, in contrast to our previous work where they were seen as neutral. In a pH environment around 7.4, there are four charged amino acids. We utilize the properties of our sequences to accurately describe the symmetries in the genetic code table. These include Rumer’s symmetry, the third-base symmetry and the “ideal” symmetry, along with the “supersymmetry” classification schemes. We also explore the special chemical structure of the amino acid proline, presenting two perspectives—shCherbak’s view and the Downes–Richardson view—which are included in the description of the above-mentioned symmetries. Our investigation also employs elementary modular arithmetic to precisely describe the chemical structure of proline, connecting the two views seamlessly. Finally, our Fibonacci-like sequences prove instrumental in quickly establishing the multiplet structure of non-standard versions of the genetic code. We illustrate this with an example, showcasing the efficiency of our method in unraveling the complex relationships within the genetic code.

1. Introduction

This paper is a continuation of a previous one, devoted to the study of the genetic code, using a novel mathematical technique based on a small set of Fibonacci-like sequences [1]. In this reference, we used these sequences, as well as some tools from elementary number theory, to derive the detailed chemical content of the amino acids encoded by the 61 sense codons, including their degeneracies, and structured by three symmetries. In the above work, the 20 amino acids were considered in their neutral (uncharged) state. In the present work, we consider an extension where four amino acids are now considered in a physiological state (neutral pH), that is, charged. As in [1], we use our Fibonacci-like sequences to derive several hydrogen atom and atom patterns corresponding to the symmetries of the 64-codon genetic code table, mentioned above. In doing so, we also consider two possible views linked to the special structure of the amino acid proline, which is known to be the only amino acid whose side chain is bound to its backbone twice. We also derive, in a new way from our Fibonacci-like sequences, the exact degeneracy structure of the standard genetic code as well as the correct number of amino acids, that is, five quartets (four codons each), three sextets (six codons each; 6 = 4 + 2), nine doublets (two codons each), one triplet (three codons), two singlets (one codon each), and finally the stop codons. As another application, we derive the exact multiplet structure of one of the non-standard versions of the genetic code, the Alternative Yeast Nuclear Code, as an example, and give hints for the application to other non-standard forms. Below, in this introduction, to give the paper a self-contained structure, we first give a summary of the (standard) genetic code (Section 1.1) and, next, the elemental (atomic) composition of the twenty amino acids (Section 1.2).

1.1. The Genetic Code

The genetic code is the set of rules used by living organisms on Earth to translate the information contained in the genetic material (the genes) into proteins. Its experimental deciphering was beautifully realized in the 1960s [2]. Out of a total of 64 possible codons, each being a triplet built from the four bases U (uracil), C (cytosine), A (adenine), and G (guanine), the standard genetic code has 61 sense codons, each of which is translated, by the biochemical machinery of the ribosome, into a given amino acid; the remaining three (non-sense) codons serve as termination signals, or stop codons. The genetic code is also said to be degenerate, meaning that an amino acid can be encoded by a specific group of codons; we call these groups “multiplets” here. The sextets are coded by six codons, the quartets by four codons, the triplet by three codons, the doublets by two codons, and finally the singlets by only one codon. These multiplets are gathered in Table 1, where the one-letter and the three-letter codes for the amino acids are given in parentheses. In Table 2, the genetic code table, i.e., the codon–amino acid correspondence, is shown.
In this table, there are 16 family boxes, and each one of them is a set of four codons sharing the same first and second base. An important peculiarity of the (standard) genetic code is the existence of the three sextets serine {UCN, AGY}, arginine {CGN, AGR}, and leucine {CUN, UUR} (N for any base, Y for pyrimidine U or C, and R for purine A or G). These three sextets have their codons distributed over separate family boxes, that is, each 6-fold codon set is composed of separate 4-fold and 2-fold parts. There are also important symmetries of the genetic code, and these will play a prominent role in this paper, as in [1] (see Section 4, Section 5 and Section 6).
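The multiplet structure described above can be checked directly from the codon table. The following is a minimal Python sketch, not taken from the paper; the 64-letter string is the usual compact way of writing the standard code (codons ordered UUU, UUC, UUA, UUG, UCU, …, with `*` for a stop codon), and the names `CODON_TABLE` and `multiplet_structure` are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, in the order UUU, UUC, UUA, UUG, UCU, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def multiplet_structure(table):
    """Map multiplicity (codons per amino acid) -> number of amino acids."""
    codons_per_aa = Counter(aa for aa in table.values() if aa != "*")
    return dict(sorted(Counter(codons_per_aa.values()).items()))


structure = multiplet_structure(CODON_TABLE)
n_stops = sum(1 for aa in CODON_TABLE.values() if aa == "*")
print(structure, n_stops)  # {1: 2, 2: 9, 3: 1, 4: 5, 6: 3} 3
```

The result reads as two singlets, nine doublets, one triplet, five quartets, and three sextets, plus three stop codons, which accounts for all 64 codons.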

1.2. The Elemental Composition of the 20 Amino Acids

Below, in Table 3, we give the elemental composition of the twenty amino acids, where four of them are in their charged (physiological) state. They are arginine (charge +1), lysine (charge +1), glutamic acid (charge −1), and aspartic acid (charge −1). These charges are indicated in colors in the table (red for +1 and blue for −1). H in the third column is for hydrogen, C in the fourth column is for carbon, and N, O, and S in the fifth column correspond respectively to nitrogen, oxygen, and sulfur. Atom numbers are given in the sixth column, and the integer molecular mass (nucleon number) is shown in the seventh column. All the given numbers correspond to the side chains of the amino acids. (Let us note, here, that when an amino acid contains both an amine group with a positive charge and a carboxylic group with a negative charge in its backbone, it is called a zwitterion and has an overall neutral charge. This has no impact on the rest of this paper, inasmuch as we are concerned, except for proline, only with the side chains of the 20 amino acids.) The number of codons, or multiplicity M, encoding each amino acid and its name together with its three-letter symbol are given in columns 1 and 2, respectively. To ease the calculations in the next sections, one can use, as we indeed do, the following pre-calculated sums for the hydrogen, atom, and nucleon contents (in the uncharged amino acid side chains). Hydrogen atoms: 21 in the five quartets, 22 in the three sextets, 50 in the nine doublets, 9 in the one triplet, and 15 (= 7 + 8) in the two singlets (see Table 3). Atom numbers: 31 in the five quartets, 35 in the three sextets, 96 in the nine doublets, 13 in the triplet, and 29 (= 11 + 18) in the two singlets (see Table 3). Nucleon numbers: 145 in the five quartets, 188 in the three sextets, 660 in the nine doublets, 57 in the one triplet, and 205 (= 75 + 130) in the two singlets (see Table 3).
In the computations in the next sections, the charges of some amino acids are to be included when needed, without forgetting, of course, the multiplicities or the degeneracies. Recall that, for an amino acid of multiplicity M, that is, the number of codons coding it, the degeneracy is simply equal to $M - 1$. In the last five rows of Table 3, several hydrogen atom, atom, and nucleon numbers have been calculated to ease the reading. Several of them, but not all, are involved in Section 4, Section 5 and Section 6.
An amino acid’s molecular structure is given by the general formula R–CH(NH2)–COOH, in which R stands for the side chain (or radical) and the remaining portion for the backbone. The only amino acid with two connections between its side chain and backbone is proline, so there is no “clear cut” between its two components, as there is for the other 19 amino acids. In this work, this special amino acid is considered in two ways, described below. In [3], shCherbak suggested a fictitious “borrowing” of one nucleon (one hydrogen atom) from the side chain of proline in favor of its backbone, which has only 73 nucleons, so that the latter reaches 74, as is the case for the other 19 amino acids; this “standardizes” the common backbone of the amino acids at 74 nucleons. He referred to this “borrowing” procedure as the “activation key” in his subsequent work with Makukov [4]. With the 20 amino acids taken in their neutral (uncharged) form, many remarkable and beautiful arithmetical patterns result from activating the key, or standardizing. On the other hand, Downes and Richardson [5] have chosen the other way, that is, not to carry out such a “borrowing”, leaving proline’s side chain with its 42 nucleons, contrary to shCherbak’s choice of 41 nucleons. These authors also derived a no less remarkable nucleon (or integer molecular mass) balance with this choice, together with considering the case where four amino acids are in their charged state. In the following sections, we consider both cases concerning proline, referred to as “activation key” on (shCherbak’s view) and “activation key” off (the Downes and Richardson view), with the four amino acids mentioned above in their charged states. The data for proline, in this context, are shown in Table 3, noted respectively as “on/off” (second row).
In the computations below, concerning the situation where the “activation key” is on or off for proline, a term “+1” is added to the hydrogen number, atom number, and nucleon number in the case of “off” and nothing in the case of “on”.
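The bookkeeping described in this section (pre-calculated sums, charge corrections, multiplicities, and the proline “+1” for the “off” case) can be sketched in a few lines of Python. This is only an illustration built from the numbers quoted above; the names `SUMS`, `CHARGE`, and `total` are hypothetical, and the sketch assumes that a charge of +1 (−1) adds (removes) one hydrogen atom in the side chain.

```python
# Pre-calculated side-chain sums (uncharged amino acids), as quoted in the text:
# multiplicity -> (hydrogen atoms, atoms) summed over the amino acids of that class.
SUMS = {4: (21, 31), 6: (22, 35), 2: (50, 96), 3: (9, 13), 1: (15, 29)}

# Net charge correction per class: Arg (+1) is a sextet; Lys (+1), Glu (-1), Asp (-1) are doublets.
CHARGE = {6: +1, 2: +1 - 1 - 1}


def total(kind, key_on=True):
    """Total hydrogen (kind=0) or atom (kind=1) count over the 61 sense codons."""
    proline = 0 if key_on else 1  # "off": one more hydrogen atom in proline's side chain
    result = 0
    for multiplicity, sums in SUMS.items():
        per_class = sums[kind] + CHARGE.get(multiplicity, 0)
        if multiplicity == 4:      # proline belongs to the five quartets
            per_class += proline
        result += per_class * multiplicity
    return result


print(total(0, True), total(0, False))  # hydrogen atoms: 362 366
print(total(1, True), total(1, False))  # atoms: 598 602
```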

1.3. The Structure of the Paper

In Section 2, we present our set of Fibonacci-like sequences. In Section 3, we present, as a first application of our Fibonacci-like sequences, the hydrogen atom content in the side chains of the amino acids coded by 61 codons, in the two views described above (“activation key” on and off) and fitting the degeneracy structure. As we said earlier, four amino acids are in their charged state. Next, we consider the three following symmetries of the genetic code, as we did in [1]: (i) Rumer’s symmetry [6], (ii) the Findley–Findley–McGlynn third-base symmetry [7] (see also [8]), and (iii) the Rosandić and Paar “ideal” symmetry and “supersymmetry” [9,10]. For each one of these symmetries, we use our Fibonacci-like sequences and their properties to fit their hydrogen atom and atom patterns. This is shown in Section 4, Section 5 and Section 6, respectively, as well as in the two views mentioned above. In Section 7, we return to the special amino acid proline and derive, from a few elements from modular arithmetic, its virtual “double” structure. In Section 8, we use our sequences again to show that they could also be applied to describe not only the multiplet structure of the standard genetic code but also one of the non-standard genetic codes as well. An illustration is given.

2. Fibonacci-like Sequences

In this section, we briefly summarize the essential elements of a set of Fibonacci-like sequences, the same as those used in [1], which we use again in this paper for new applications. These sequences are defined, in terms of the ordinary Fibonacci sequence, by the recurrence relation ($n \geq 2$)
$$S_n = p F_{n-1} + q F_{n-2}$$
where $S_n$ denotes collectively the five sequences, named in the sequel $a_n$, $a'_n$, $b_n$, $c_n$, and $g_n$. In Table 4 below, the first few terms are given.
The choice of the “seeds”, or initial conditions (p, q), of these sequences was shown in [1] to be especially appropriate and useful in its consequences; see, on this subject, Sections 3 and 4.2.5 of that reference. As we shall see in this study, these sequences are also crucial in opening up new application possibilities. It is important to note that the Fibonacci and Lucas sequences can be obtained as a by-product of the sequences $a_n$ and $a'_n$. The difference
$$a_n - a'_{n-1},$$
gives the (slightly modified) Fibonacci sequence denoted $F_n$,
$$F_n: 1, 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, \ldots; \quad n = 1, 2, 3, \ldots$$
in an unusual but interesting form: its “seeds”, here, are inverted with respect to the usual Fibonacci sequence. Also, the sum of its first members up to a certain index gives exactly a Fibonacci number, contrary to the ordinary Fibonacci sequence with seeds 0, 1, whose partial sums are always one unit less than a Fibonacci number. For example, in our case, summing up to $n = 10$, we obtain $\sum_{n=1}^{10} F_n = 55$. Moreover, the relation
$$L_n = F_n + F_{n+2}$$
gives the Lucas sequence:
$$L_n: 2, 1, 3, 4, 7, 11, 18, 29, 47, 76, \ldots$$
It is important to note that the sequences in Table 4 are intertwined by a (large) number of identities connecting them (see Equation (2) in [1] for some of them). The reader can consult Appendix C of the latter reference to see how it is possible to check these identities for any large or very large values of the index n by using a computer with mathematical software containing a built-in Fibonacci function. For the low values of the index n in Table 4, the verification could be carried out immediately by hand or using a pocket calculator. We shall also use some of these identities in our applications in this paper, as we successfully did in our above-mentioned recent paper. The identities we need will be presented as we go along, in the appropriate place, where we use them for the first time.
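As a rough illustration of the kind of computer check mentioned above, the following Python sketch generates the sequences and tests a few of the relations of this section. The seeds are not copied from Table 4: they are inferred from the numerical values quoted later in the text (for example 61, 99, 139, 358, 236, 190), and the assignment of the prime to the sequence 1, 6, 7, 13, … is an assumption of this sketch. The helper `fib_like` is hypothetical.

```python
def fib_like(s1, s2, length):
    """First `length` terms of S_1 = s1, S_2 = s2, S_n = S_{n-1} + S_{n-2}."""
    terms = [s1, s2]
    while len(terms) < length:
        terms.append(terms[-1] + terms[-2])
    return terms[:length]


N = 12
# Seeds inferred from values quoted in the text (assumption; Table 4 is authoritative).
a = fib_like(6, 1, N)         # a_n : 6, 1, 7, 8, 15, 23, 38, 61, 99, ...
a_prime = fib_like(1, 6, N)   # a'_n: 1, 6, 7, 13, 20, 33, 53, 86, 139, ...
b = fib_like(13, 9, N)        # b_n : 13, 9, 22, 31, 53, 84, 137, 221, 358, ...
c = fib_like(30, 5, N)        # c_n : 30, 5, 35, 40, 75, 115, 190, ...
g = fib_like(23, -3, N)       # g_n : 23, -3, 20, 17, 37, 54, 91, 145, 236, ...

F = fib_like(1, 0, N)         # modified Fibonacci sequence 1, 0, 1, 1, 2, 3, ...
lucas = [F[i] + F[i + 2] for i in range(N - 2)]   # L_n = F_n + F_{n+2}

# a_n - a'_{n-1} reproduces F_n for n >= 2 (list index i is n - 1).
difference_ok = all(a[i] - a_prime[i - 1] == F[i] for i in range(1, N))

print(sum(F[:10]))      # 55
print(lucas[:6])        # [2, 1, 3, 4, 7, 11]
print(difference_ok)    # True
```

Because every sequence here obeys the same two-term recurrence, a linear identity that holds for two consecutive indices holds for all of them, so short checks of this kind are already conclusive.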

3. Hydrogen Atom Content

In this section, we use the Fibonacci-like sequences defined in the preceding section to derive the hydrogen atom content in the side chains of the amino acids encoded by 61 codons. Also, as explained in the Introduction, we consider that four amino acids are charged and that the side chain of proline has, for the calculations in this section, either 5 hydrogen atoms, in the “on” situation, or 6 ($= 5 + 1$) in the “off” situation (see the Introduction and Table 3).

3.1. Hydrogen Atom Content: “Activation Key” On

In this case, we count, from Table 3, the number of hydrogen atoms:
$$21 \times 4 + (22 + 1) \times 6 + (50 + 1 - 1 - 1) \times 2 + 9 \times 3 + 7 + 8 = 362$$
(We have used the pre-calculated sums mentioned above in Table 3 and included the charges where they are necessary.) This number can be computed from our Fibonacci-like sequence $a'_n$, using the identity
$$\sum_{n=1}^{k} a'_n = a'_{k+2} - 6$$
For $k = 9$, we have, isolating the last term $a'_9$,
$$\sum_{n=1}^{8} a'_n + a'_9 = 219 + 139 = 364 - 6$$
As 6 is a perfect number (equal to the sum of its proper divisors), we have $6 = 1 + 2 + 3$. By leaving the even number 2 on the right, transferring the odd numbers 1 and 3 to the left and arranging, we obtain
$$(219 + 3) + (139 + 1) = 222 + 140 = 362$$
We have here the correct distribution of the hydrogen atoms in the “23 + 38” codon pattern, to be compared with what the data of Table 3 give (see the last rows in the table): $21 + (22 + 1) \times 2 + (50 + 1 - 1 - 1) + 9 + 7 + 8 = 140$ in the “23” part (the sextets counted twice) and $21 \times 3 + (22 + 1) \times 4 + (50 + 1 - 1 - 1) + 9 \times 2 = 222$ in the “38” degeneracy part. (See more about this pattern in [1].)
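The derivation of this subsection can be replayed numerically. The sketch below is illustrative only: it assumes the sequence with terms 1, 6, 7, 13, … (called `a_prime` here, a naming assumption) and the pre-calculated sums quoted in Section 1.2.

```python
# Sequence 1, 6, 7, 13, 20, 33, 53, 86, 139, 225, 364 (indices 1..11 in the text).
a_prime = [1, 6]
while len(a_prime) < 11:
    a_prime.append(a_prime[-1] + a_prime[-2])

k = 9
partial_sum = sum(a_prime[:k])                       # sum of the first nine terms
identity_holds = partial_sum == a_prime[k + 1] - 6   # term k+2 sits at list index k+1

# Move the divisors 1 and 3 of 6 to the left-hand side, keep 2 on the right.
part_38 = sum(a_prime[:8]) + 3    # 219 + 3
part_23 = a_prime[8] + 1          # 139 + 1

# The same two numbers from the pre-calculated sums (charges included).
sextets, doublets = 22 + 1, 50 + 1 - 1 - 1
table_23 = 21 + sextets * 2 + doublets + 9 + 7 + 8
table_38 = 21 * 3 + sextets * 4 + doublets + 9 * 2

print(identity_holds)                          # True
print(part_38, part_23, table_38, table_23)    # 222 140 222 140
```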

3.2. Hydrogen Atom Content: “Activation Key” Off

In this case, proline has one more hydrogen atom in its side chain and we have from Table 3:
$$(21 + 1) \times 4 + (22 + 1) \times 6 + (50 + 1 - 1 - 1) \times 2 + 9 \times 3 + 7 + 8 = 366$$
Here, we use the identity connecting the sequences $a_n$ and $b_n$:
$$a_n + b_{n+1} = a_{n+4}$$
For $n = 4$, we have $8 + 53 = 61$. Multiplying both sides by 6, we have
$$6 \times 8 + 6 \times 53 = 6 \times 61 = 366$$
It suffices now to use the recurrence relation of $b_n$ twice ($53 = 31 + 22$, $31 = 22 + 9$) and arrange, to finally obtain
$$(6 \times 22 + 1 \times 9) + (5 \times 9 + 6 \times 22 + 6 \times 8) = 141 + 225 = 366$$
which is the desired result (see Table 3 and its last rows): $(21 + 1) + (22 + 1) \times 2 + (50 + 1 - 1 - 1) + 9 + 7 + 8 = 141$ in the “23” part (the sextets counted twice) and $(21 + 1) \times 3 + (22 + 1) \times 4 + (50 + 1 - 1 - 1) + 9 \times 2 = 225$ in the “38” degeneracy part.
We can also compute the hydrogen atom content of the amino acid side chains in the different groups of multiplets (those in Table 1). Consider, first, the case of “activation key” on. From Table 3, we have
$$21 \times 4 + (22 + 1) \times 6 + (50 + 1 - 1 - 1) \times 2 + 9 \times 3 + 7 + 8 = 84 + 138 + 98 + 27 + 7 + 8 = 362$$
These numbers are, respectively, the number of hydrogen atoms in the side chains of the quartets, the sextets, the doublets, the triplet, methionine, and tryptophan. To compute these numbers by using our Fibonacci-like sequences, let us rewrite the sum in Equations (8) and (9) above as (see Table 4)
$$1 + 6 + 7 + 13 + 20 + 33 + 53 + 86 + 3 + (139 + 1) = 362$$
and use the following identity:
$$a'_n - b_{n-2} = 2 F_{n-5}$$
which, for $n = 8$ and $9$, gives, respectively, $86 - 84 = 2$ and $139 - 137 = 2$. By substituting $86 = 84 + 2$ and $139 = 137 + 2$ in the above relation and grouping, we have:
$$(13 + 33 + 53 + 7) + (20 + 6 + 1) + 84 + (2 + 3 + 2) + (137 + 1) = 362$$
It just remains to write the number 7, in the first parentheses, as $8 - 1$ from the recurrence relation of the sequence $a_n$, that is, $a_2 + a_3 = a_4$ ($1 + 7 = 8$), to finally obtain
$$98 + 27 + 84 + 7 + 8 + 138 = 362$$
which are the numbers of hydrogen atoms in the multiplets described above in Equation (14). In the second case, “activation key” off, we start from the identity $6 (a_n + b_{n+1}) = 6 a_{n+4}$, see Equation (11); the multiplication by the factor 6 does not change it. We have
$$6 \times 61 = 6 \times (23 + 38) = 6 \times 23 + (6 \times 23 + 6 \times 7 + 6 \times 8) = 366$$
where we have used the recurrence relation for the sequence $a_n$ thrice ($61 = 23 + 38$, $38 = 23 + 15$, $15 = 7 + 8$). Arranging, we obtain, using also $8 = 7 + 1$ ($a_4 = a_3 + a_2$),
$$6 \times 23 + (2 \times 23 + 6 \times 7) + (4 \times 23 + 1 \times 6) + 6 \times 7 = 366$$
The last term, $6 \times 7$, a bit whimsical, can be handled as follows. The Fibonacci-like sequences we have defined can be continued to negative values of their indices, as is well known for the usual Fibonacci and Lucas sequences and for any other sequence of the same kind. Here, we appeal only to the first term of this continuation, the value $a_0 = -5$ (see Table 4). It is not shown in this table, but it follows from $a_0 + a_1 = -5 + 6 = a_2 = 1$, or $6 = 5 + 1$. We therefore write the said term, $6 \times 7$, as $5 \times 7 + 7 = 5 \times 4 + 5 \times 3 + 7$, because 7 is a Lucas number ($7 = 4 + 3$). Finally, $5 \times 3 = 15 = 7 + 8$ by virtue of the recurrence relation $a_5 = a_3 + a_4$. Ultimately, we end up with ($5 \times 4 + 7 = 27$)
$$138 + 88 + 98 + 7 + 8 + 27 = 366$$
which can be compared with the result obtained from Table 3:
$$(21 + 1) \times 4 + (22 + 1) \times 6 + (50 + 1 - 1 - 1) \times 2 + 9 \times 3 + 7 + 8 = 88 + 138 + 98 + 27 + 7 + 8 = 366$$
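The per-multiplet hydrogen counts that the rearrangements of this section aim at can be tabulated directly from the pre-calculated sums. The following Python sketch is only a cross-check of the Table 3 side of the comparison; the function name `hydrogen_by_multiplet` is illustrative.

```python
def hydrogen_by_multiplet(key_on=True):
    """Hydrogen atoms per multiplet class, charges and multiplicities included."""
    proline = 0 if key_on else 1          # "off": proline keeps one more hydrogen atom
    return {
        "quartets": (21 + proline) * 4,
        "sextets": (22 + 1) * 6,              # arginine carries +1
        "doublets": (50 + 1 - 1 - 1) * 2,     # lysine +1, glutamic and aspartic acid -1
        "triplet": 9 * 3,
        "methionine": 7,
        "tryptophan": 8,
    }


on, off = hydrogen_by_multiplet(True), hydrogen_by_multiplet(False)
print(list(on.values()), sum(on.values()))     # [84, 138, 98, 27, 7, 8] 362
print(list(off.values()), sum(off.values()))   # [88, 138, 98, 27, 7, 8] 366
```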

4. Rumer’s Symmetry

Rumer’s symmetry [6] is defined by the transformation $U \leftrightarrow G$, $A \leftrightarrow C$. It divides the 8 × 8 genetic code table into two equal parts of 32 codons each, called here $M_1$ and $M_2$. In Table 5, below, we show such a division. The eight quartets of codons (eight family boxes; see Section 1.1) that make up the set $M_1$, which has a grey background, each have the same first two bases and code for the same amino acid, the third base being inconsequential. Three of the eight quartets in this set (serine, arginine, and leucine) correspond to the quartet portions of the three sextets. Group-I amino acids (two singlets), group-II amino acids (nine doublets), group-III amino acids (one triplet), and three stops, or termination codons, are included in the set $M_2$. Note, importantly, that the latter also encompasses the three doublet portions of the three sextets. The point here, concerning symmetry, is that the sets $M_1$ and $M_2$ are exchanged under Rumer’s transformation, which is applied to all three bases.
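The exchange of the two sets under Rumer’s transformation can be verified mechanically. The Python sketch below is an illustration, not part of the paper: it builds the standard code from the usual compact 64-letter string, takes $M_1$ to be the codons whose family box codes a single amino acid, and applies the base exchange to all three positions. The names `TABLE`, `RUMER`, and `in_m1` are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in the order UUU, UUC, UUA, UUG, UCU, ... ("*" = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

RUMER = str.maketrans("UGAC", "GUCA")   # U <-> G, A <-> C


def in_m1(codon):
    """True if the family box of `codon` (same first two bases) codes one amino acid."""
    box = codon[:2]
    return len({TABLE[box + third] for third in BASES}) == 1


M1 = {codon for codon in TABLE if in_m1(codon)}
M2 = set(TABLE) - M1
image_of_M1 = {codon.translate(RUMER) for codon in M1}

print(len(M1), len(M2))     # 32 32
print(image_of_M1 == M2)    # True
```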

4.1. The Hydrogen Atom Content

In this section, we compute the hydrogen atom content in the two Rumer sets $M_1$ and $M_2$, using our Fibonacci-like sequences, and compare it with Table 3.

4.1.1. “Activation Key” On

We have, from Table 3 (see the last row in the table):
$$M_1: \; 21 \times 4 + (22 + 1) \times 4 = 176, \qquad M_2: \; (22 + 1) \times 2 + (50 + 1 - 1 - 1) \times 2 + 9 \times 3 + 7 + 8 = 186$$
with a total of 362. Now, we use Equation (8) of Section 3.1 again and write it in the form
$$\sum_{n=1}^{7} a'_n + a'_8 + a'_9 = (133 + 53) + 2 \times 86 = 186 + 2 \times 86 = 364 - 6$$
As we did before, we use the fact that 6 is a perfect number ($6 = 1 + 2 + 3$) to bring the above relation to the final form, to be compared with Equation (23) above:
$$186 + (2 \times 86 + 1 + 3) = 186 + 176 = 364 - 2 = 362$$

4.1.2. “Activation Key” Off

Table 3 gives, in this case,
$$M_1: \; (21 + 1) \times 4 + (22 + 1) \times 4 = 180, \qquad M_2: \; (22 + 1) \times 2 + (50 + 1 - 1 - 1) \times 2 + 9 \times 3 + 7 + 8 = 186$$
with a total of 366 hydrogen atoms. Here, we use Equation (12) of Section 3.2 again:
$$6 \times 8 + 6 \times 53 = 6 \times 61 = 366$$
and simply introduce the recurrence relation $53 = 31 + 22$ of the sequence $b_n$, see Table 4, to obtain
$$6 \times (8 + 22) + 6 \times 31 = 180 + 186 = 366$$
which describes the two hydrogen atom values in Equation (26) above.
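The Table 3 side of the Rumer hydrogen pattern can be recomputed from the pre-calculated sums of Section 1.2. In this minimal Python sketch (the function name `rumer_hydrogen` is illustrative), $M_1$ collects the five quartets and the quartet parts of the three sextets, and $M_2$ everything else.

```python
def rumer_hydrogen(key_on=True):
    """Hydrogen atoms in Rumer's sets (M1, M2), charges included."""
    quartets = 21 + (0 if key_on else 1)   # proline: +1 hydrogen atom when the key is off
    sextets = 22 + 1                       # arginine carries +1
    doublets = 50 + 1 - 1 - 1              # lysine +1, glutamic and aspartic acid -1
    triplet, singlets = 9, 7 + 8
    m1 = quartets * 4 + sextets * 4
    m2 = sextets * 2 + doublets * 2 + triplet * 3 + singlets
    return m1, m2


print(rumer_hydrogen(True))    # (176, 186)
print(rumer_hydrogen(False))   # (180, 186)
print(6 * (8 + 22), 6 * 31)    # 180 186, the split obtained from the b_n recurrence
```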

4.2. The Atom Content (CHNOS)

4.2.1. “Activation Key” On

From Table 3, we have
$$M_1: \; 31 \times 4 + (35 + 1) \times 4 = 268, \qquad M_2: \; (35 + 1) \times 2 + (96 + 1 - 1 - 1) \times 2 + 13 \times 3 + 11 + 18 = 330$$
with a total of 598 atoms. To describe this atom pattern, we use three ingredients: (i) elements of the sequence $g_n$, (ii) the relation $358 + 4 = 362$, from Equation (7) in Section 3.1, and (iii) the identity
$$b_n + g_n = 6 a_n$$
This latter identity, for $n = 9$, gives $358 + 236 = 594$. Inserting the number 358, from the relation in the line above Equation (30), gives $362 - 4 + 236 = 594$ or $362 + 236 = 598$. Finally, by adding and subtracting the quantity $\sum_{n=1}^{5} g_n = 94$, computed from Table 4, on the left-hand side, we obtain
$$(362 - 94) + (236 + 94) = 268 + 330 = 598$$
This is the desired result.

4.2.2. “Activation Key” Off

In this case, we have from Table 3 (see also the last rows in the table):
$$M_1: \; (31 + 1) \times 4 + (35 + 1) \times 4 = 272, \qquad M_2: \; (35 + 1) \times 2 + (96 + 1 - 1 - 1) \times 2 + 13 \times 3 + 11 + 18 = 330$$
with a total of 602 atoms. This case can be handled by using the following identity:
$$4 a_n + b_{n+1} - 2 F_{n-6} = 7 a'_n$$
where $F_n$ is the Fibonacci sequence defined in Equations (2) and (3). For $n = 8$, we have
$$4 \times 61 + 358 - 2 \times 0 = 7 \times 86 = 602$$
By using the recurrence relation of the sequence $b_n$ twice, $358 = 84 + 2 \times 137$, and, next, replacing 84 by $86 - 2$ from the identity in Equation (16) of Section 3.2 for $n = 8$, we obtain
$$(4 \times 61 + 86) + (2 \times 137 - 2) = 330 + 272 = 602$$
The numbers on the right-hand side therefore correctly describe the pattern above for $M_2$ and $M_1$, respectively.
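The identities used in this section can be tested over a range of indices. The sketch below is illustrative and rests on assumptions: the seeds are inferred from the values quoted in the text (Table 4 is authoritative), the sequence 1, 6, 7, 13, … is the one called `a_prime`, `F` is the modified Fibonacci sequence 1, 0, 1, 1, 2, …, and indices are 1-based with, for instance, the 8th term of `a_prime` equal to 86 and the 6th term of `b` equal to 84.

```python
def fib_like(s1, s2, length):
    """First `length` terms of S_1 = s1, S_2 = s2, S_n = S_{n-1} + S_{n-2}."""
    terms = [s1, s2]
    while len(terms) < length:
        terms.append(terms[-1] + terms[-2])
    return terms[:length]


N = 16
a, a_prime = fib_like(6, 1, N), fib_like(1, 6, N)
b, g, F = fib_like(13, 9, N), fib_like(23, -3, N), fib_like(1, 0, N)


def term(seq, n):
    """1-based access, matching the indices used in the text."""
    return seq[n - 1]


ns = range(7, 13)
id_bg = all(term(b, n) + term(g, n) == 6 * term(a, n) for n in ns)
id_7a = all(4 * term(a, n) + term(b, n + 1) - 2 * term(F, n - 6) == 7 * term(a_prime, n)
            for n in ns)
id_16 = all(term(a_prime, n) - term(b, n - 2) == 2 * term(F, n - 5) for n in ns)

print(id_bg, id_7a, id_16)                   # True True True
print(term(b, 9) + term(g, 9) + 4)           # 598 atoms ("on")
print(4 * term(a, 8) + term(b, 9))           # 602 atoms ("off")
print(term(a_prime, 8) - term(b, 6), term(a_prime, 9) - term(b, 7))   # 2 2
```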

5. The 3rd-Base Symmetry Classification

By considering the genetic code as an f-mapping, Findley et al. [7] extracted a basic symmetry for the doubly degenerate codons (group-II). Some excerpts from the aforementioned reference are in order for understanding what an f-mapping is. The first, second, and third bases in a codon are denoted by the letters i, j, and k, and B stands for the set of bases U, C, A, and G. The authors consider the 64-codon set, $\mathcal{C}$, and define $C_k = \{ C_{ijk} \in \mathcal{C} \mid i, j \in B, \; k \in B \}$, where i, j, k designate the first, second, and third bases in the codon $C_{ijk}$. The family $C_k$, $k \in B$, partitions $\mathcal{C}$ into four separate subsets, where each subset contains only codons having the same third base. Each of these subsets is mapped by f onto members of the amino acid set $\mathcal{A}$, with the image being denoted $f(C_k)$; this is shown in Table 6, below.
Therefore $f(C_U) = f(C_C)$ and $f(C_A) \approx f(C_G)$. With this f-mapping, a one-to-one correspondence is established between one member of a doubly degenerate codon pair and the other member. Equivalently, these relationships can be rephrased as follows: (i) if a codon for an amino acid has third base U, then there is a codon for the same amino acid having third base C, and the other way round, or (ii) if a codon of an amino acid has third base A, then there is a codon of the same amino acid having third base G, and the other way round. For a doubly degenerate codon pair, (i) and (ii) are mutually exclusive. For the quartets (group-IV), (i) and (ii) hold simultaneously. For the sextets (group-VI), the quartet part obeys (i) and (ii) and, for the doublet part, one has (i) or (ii). For the odd-order degenerate codons (group-I and group-III), however, there is a small deviation from symmetry. In Table 6, we show this classification. In the last two rows of this table, we have calculated, from Table 3, the hydrogen atom content and the atom content in the side chains of the amino acids in the four columns, in the two views “on” and “off” (see Section 1.2). Note the hydrogen atom balances ($2 \times 84$, $2 \times 85$) and atom number balances ($2 \times 144$, $2 \times 145$) in the last two rows in Table 6. These express the exact one-to-one correspondence mentioned above (here, the two codons of isoleucine AUU and AUC constitute an order-2 doublet). These balances will be established from our Fibonacci-like sequences below in this section.
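The third-base relationships can be illustrated on the standard code itself. The Python sketch below is not taken from [7]: it represents each image $f(C_k)$ position by position, as a map from the first two bases to the encoded amino acid (a stricter comparison than comparing sets of amino acids, and an assumption of this sketch), and lists where the A and G columns differ. The names `TABLE` and `f_image` are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in the order UUU, UUC, UUA, UUG, UCU, ... ("*" = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def f_image(k):
    """Amino acid assigned to each (first, second) base pair when the third base is k."""
    return {i + j: TABLE[i + j + k] for i, j in product(BASES, repeat=2)}


img = {k: f_image(k) for k in BASES}
same_UC = img["U"] == img["C"]
diff_AG = {box: (img["A"][box], img["G"][box])
           for box in img["A"] if img["A"][box] != img["G"][box]}

print(same_UC)   # True
print(diff_AG)   # {'UG': ('*', 'W'), 'AU': ('I', 'M')}
```

The two exceptions in the A/G columns are exactly the odd-order cases mentioned in the text: UGA (stop) versus UGG (tryptophan), and AUA (isoleucine) versus AUG (methionine).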

5.1. The Hydrogen Atom Content

5.1.1. “Activation Key” On

In the U/C third-base set, there are $2 \times 84$ hydrogen atoms. In the A/G third-base set there are, respectively, 94 and 100 hydrogen atoms (grand total of 362, see Table 6 above). To describe this pattern, using our Fibonacci-like sequences, let us start again from Equation (24) of Section 4.1.1 and write it in the following form, by writing out the sum explicitly:
$$(1 + 6 + 7 + 13 + 20 + 53) + (33 + 53 + 2 \times 2 + 4) + 2 \times 84 = 100 + 94 + 2 \times 84 = 362$$
Note that we have included the sixth term of the sequence, $a'_6 = 33$, from the sum $\sum_{n=1}^{7} a'_n$, in the second parentheses. In this way, we reach the correct hydrogen atom pattern.

5.1.2. “Activation Key” Off

In this case, let us recall Equation (27) of Section 4.1.2 (or Equation (12) of Section 3.2, which is the same)
$$6 \times (8 + 22) + 6 \times 31 = 180 + 186 = 366$$
and use the following identity linking the sequences $a_n$ and $b_n$
$$a_n + a_{n+2} = b_n$$
which, for $n = 4$, is written $8 + 23 = 31$. By inserting this last number, 31, in the above equation and arranging, in a first step, we have
$$\left[ 6 \times (8 + 22) + 2 \times 8 \right] + (4 \times 8 + 6 \times 23) = 180 + 186 = 366$$
The second parentheses on the left-hand side can be written as $2 \times (2 \times 8 + 3 \times 23) = 2 \times 85$. This is the correct pattern for the U/C third-base set, and the other part in the above equation remains to be handled. A quick way consists in writing the term $2 \times 8$ above as $8 + 8 = 8 + 3 + 5$, as 8 is a Fibonacci number. All this allows us to put the above equation in the following form:
$$\left[ 3 \times (8 + 22) + 8 + 3 \right] + \left[ 3 \times (8 + 22) + 5 \right] + 2 \times 85 = 101 + 95 + 2 \times 85 = 366$$
which can be compared with the data in Table 6 (case “off”).

5.2. The Atom Content

5.2.1. “Activation Key” On

Let us, here, start from Equation (30) in Section 4.2.1, written as
$$6 a_9 + 4 = 6 \times 99 + 4 = 598$$
and use, first, in cascade the recurrence relation of the sequence $a_n$:
$$6 \times (38 + 23 + 23 + 15) + 4 = 598$$
Now, we arrange this relation as follows:
$$2 \times (3 \times 38 + 2 \times 15) + (6 \times 23 + 2 \times 15) + (6 \times 23 + 4) = 2 \times 144 + (6 \times 23 + 2 \times 15) + (6 \times 23 + 4)$$
To obtain the correct atom number pattern, we note that because of the following identity of the sequence $a_n$:
$$\sum_{n=1}^{k} a_n = a_{k+2} - 1$$
we can, for $k = 4$, write $6 + 1 + 7 + 8 = 22 = 23 - 1$ or $23 = 22 + 1$. By inserting this latter value in Equation (43) above, we obtain
$$2 \times 144 + (6 \times 22 + 15) + (6 + 15 + 6 \times 23 + 4) = 2 \times 144 + 147 + 163 = 598$$
We recognize here the correct atom number pattern (see Table 6).

5.2.2. “Activation Key” Off

This case is easily handled by starting from Equation (34) of Section 4.2.2. Using the recurrence relation of the sequence $b_n$ ($137 = 84 + 53$), we write it as
$$4 \times 61 + 84 + 2 \times (84 + 53) = 602$$
Next, we use, again, the identity $a_n + a_{n+2} = b_n$, already considered in Section 5.1.2, but now for $n = 6$: $23 + 61 = 84$. By inserting this relation in the equation above, we have
$$2 \times (2 \times 61 + 23) + (2 \times 53 + 2 \times 61 + 84) = 602$$
As the first term is already correct, we examine the second. Using the recurrence relations of both sequences $b_n$ and $a_n$, we can write $53 = 22 + 31 = 2 \times 22 + 9$ and $61 = 38 + 23 = 23 + 2 \times 15 + 8$. By inserting these values in the equation above, we end up with
$$2 \times (2 \times 61 + 23) + (4 \times 22 + 4 \times 15) + (2 \times 9 + 2 \times 23 + 2 \times 8 + 84) = 2 \times 145 + 148 + 164 = 602$$
which is the correct answer.

6. The “Ideal” Symmetry and the “Supersymmetry” Classification Schemes

In the “ideal” symmetry classification scheme [9], the three sextets serine, arginine, and leucine, each of them encoded by six codons, are used as “generators”, with serine playing the central role. These three objects are underlined in Table 7 below. This approach separates the 64-codon matrix into two groups, the “leading” group and the “non-leading” group, each of which has 32 codons. The (equal) A+U-rich and the G+C-rich parts make up each group. The “ideal” classification scheme is engendered by combining the six codons of serine, arginine, and leucine in the following way. The entire “leading” group (consisting of 32 codons) is defined by the initial generator, serine, which has six codons; arginine which, too, has six codons; and leucine, which has only the quartet part of its six codons. On the other hand, the leftover doublet portion of leucine serves as a “seed” for the creation of the 32-codon “non-leading” group. According to this scheme, the genetic code table is produced by codon sextets based on exact purine/pyrimidine symmetries, A+U-rich/C+G-rich symmetries, and direct/complement symmetries (see [9]). The table below shows these groups.
Soon after the publication of the paper [9], the authors postulated, in [10], the existence of what they call a “supersymmetric” genetic code table, derived from the “ideal” symmetry genetic code table, having now five symmetries between bases, codons, and amino acids. These are purine–pyrimidine between bases and codons, direct–complement symmetry of codons between boxes, A+U-rich and C+G-rich symmetry of codons between two columns, and mirror symmetry between all purines and pyrimidines of the whole code and between second and third bases of codons (see [10]). This “supersymmetry” genetic code table is shown in Table 8.

6.1. Hydrogen Atom Content

6.1.1. “Activation Key” On

The hydrogen atom count is as follows, from Table 3 and Table 8: leading group (in yellow and orange, as in Table 7): 192; non-leading group (in light grey and light blue, as in Table 7): 170. To derive this hydrogen atom pattern, let us start from Equation (25) of Section 4.1.1 and use again the equality $86 = 84 + 2$ (from the identity in Equation (16) of Section 3.2 for $n = 8$) to obtain, after arranging,
$$(186 + 4 + 2) + (2 \times 84 + 2) = 192 + 170 = 362$$
which is the correct result.

6.1.2. “Activation Key” Off

In this case, the hydrogen atom count is as follows: leading group: 192; non-leading group: 174. Here, we start from Equation (27) of Section 4.1.2:
$$6 \times 8 + 6 \times 53 = 6 \times 61 = 366$$
We consider, first, the number 8 and use the recurrence relation of the sequence $a_n$ to write it as $8 = 7 + 1$; next, we use the recurrence relation of $b_n$, $53 = 22 + 31$. With these elements, we can write Equation (50) as follows:
$$6 \times (1 + 31) + 6 \times (22 + 7) = 192 + 174 = 366$$
This is the correct result.
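The two hydrogen splits of Section 6.1 can be re-checked numerically. The Python sketch below is an illustration only and not part of the original derivation: the helper names `fib_like` and `term` are hypothetical, the first two terms of each sequence are copied from Table 4, and the constants 186 and 4 are taken as given from Equation (25).

```python
def fib_like(first, second, count=15):
    """Return `count` terms of x_n = x_(n-1) + x_(n-2), starting from the two given terms."""
    terms = [first, second]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms


def term(seq, n):
    """1-based access, matching the column index n of Table 4."""
    return seq[n - 1]


# First two terms as listed in Table 4 (assumed to be transcribed correctly).
a = fib_like(6, 1)    # 6, 1, 7, 8, 15, 23, 38, 61, 99, ...
b = fib_like(13, 9)   # 13, 9, 22, 31, 53, 84, 137, ...

# "Activation key" on: (186 + 4 + 2) + (2 x 84 + 2)
on_leading = 186 + 4 + 2
on_non_leading = 2 * term(b, 6) + 2

# "Activation key" off: 6 x (1 + 31) + 6 x (22 + 7)
off_leading = 6 * (term(a, 2) + term(b, 4))
off_non_leading = 6 * (term(b, 3) + term(a, 3))

print(on_leading, on_non_leading, on_leading + on_non_leading)      # 192 170 362
print(off_leading, off_non_leading, off_leading + off_non_leading)  # 192 174 366
```

The check only confirms the arithmetic of the regrouping; the biological counts themselves come from Table 3 and Table 8.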

6.2. Atom Content

6.2.1. “Activation Key” On

From Table 3 and Table 8, we have 316 atoms in the leading group and 282 atoms in the non-leading group. Here, we start from the relation $362 + 236 = 598$, which led to Equation (31) of Section 4.2.1 but, this time, we add and subtract the quantity $\sum_{n=1}^{6} a'_n = 80$ (see Table 4) to obtain the correct result:
$$(362 - 80) + (236 + 80) = 282 + 316 = 598$$

6.2.2. “Activation Key” Off

In this case, the atom number in the leading group is the same as before (316), but the atom number in the non-leading group is now equal to 286. This case can be handled by appealing to the identity in Equation (33) of Section 4.2.2, which is again written for $n = 8$:
$$4 \times 61 + 358 - 2 \times 0 = 7 \times 86 = 602$$
We first write 358 as $84 + 2 \times 137$, as in Section 4.2.2, but we now (i) select one copy of the number 61 in the above relation and write it as $23 + 38$, by virtue of the recurrence relation of the sequence $a_n$, and (ii) use the identity in Equation (16) ($a'_n - b_{n-2} = 2F_{n-5}$) for $n = 8$, that is, $139 - 137 = 2$. This allows us to put Equation (53a) above in the form:
$$(2 \times 139 + 38) + (84 + 3 \times 61 + 23 - 2 \times 2) = 316 + 286 = 602$$
which is the correct result.
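For readers who want the intermediate steps, the regrouping can be spelled out as follows. This is only a sketch of the algebra implied by the text, using $61 = 23 + 38$ for one copy of 61 and $137 = 139 - 2$ for both copies of 137:

$$\begin{aligned}
4 \times 61 + 358 &= 3 \times 61 + (23 + 38) + 84 + 2 \times 137 \\
&= 3 \times 61 + 23 + 38 + 84 + 2 \times (139 - 2) \\
&= (2 \times 139 + 38) + (84 + 3 \times 61 + 23 - 2 \times 2) \\
&= 316 + 286 = 602
\end{aligned}$$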

6.3. The “Supersymmetry” Genetic Code Table

Since the “supersymmetry” genetic code table [10] was not considered in [1], where the 20 amino acids were all taken in their uncharged state and proline’s side chain was considered in shCherbak’s view (5 hydrogen atoms, 8 atoms, and 41 nucleons), we first give the corresponding results here and then consider the case where the four amino acids mentioned earlier are charged, with proline in its two views, on and off.

6.3.1. Uncharged Amino Acid Case and “Activation Key” On Only

Consider, first, the identity
$$g_n + a_{n+2} + 2b_{n-1} = c_n + 2b_{n-1}$$
where we have added the same quantity $2b_{n-1}$ to both sides. For $n = 7$, we have from Table 4
$$91 + 99 + 2 \times 84 = 190 + 2 \times 84 = 358$$
The sum $190 + 2 \times 84 = 358$, describing the leading group/non-leading group hydrogen atom pattern, has already been obtained in [1], but the (new) quantity $91 + 99 + 2 \times 84$ will be useful in what follows. Using again the identity in Equation (16) for $n = 7$ ($84 = 86 - 2$) and next the identity in Equation (7) of Section 3.1 for $n = 6$, which gives $80 = 86 - 6$, we can put the left-hand side of Equation (55) in the form
$$91 + 99 + (80 + 88)$$
If we take the number 91, the 7th term of the sequence $g_n$, $91 = 37 + 54$, and write it as $54 + 2 \times 17 + 3 = 88 + 3$, because $17 = 20 - 3$ in the same sequence, we then have, from Equation (56),
$$2 \times 88 + (99 + 3 + 80) = 176 + 182$$
This is the direct box/complement box hydrogen atom pattern, respectively (see Table 8). (The calculations from this table go along the same lines as in the above sections. For the direct boxes, for example, take all the amino acids inside them and, taking into account their number of codons, compute the number of hydrogen atoms; do the same for the complement boxes.) To derive the hydrogen atom pattern for the mirror symmetry, a quicker and more elegant way is as follows. Consider the identity
$$g_n + b_{n-3} = 2a_{n+1}$$
For $n = 7$, we have $91 + 31 = 2 \times 61$ (see Table 4). By inserting this last relation in Equation (56) above, we obtain
$$(2 \times 61 + 88) + (99 + 80 - 31) = 210 + 148$$
This is the hydrogen atom pattern for the “mirror” symmetry (see Table 8 above; see also Figure 2 in [10] and the detailed explanations therein about this beautiful symmetry).
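The two identities used in this subsection, and the three splits of 358 they lead to, can be verified with a few lines of Python. This is a minimal sketch under stated assumptions: the helpers `fib_like` and `term` are hypothetical names, the starting terms are those listed in Table 4, and the constants 88, 80, and 3 are taken directly from the text above.

```python
def fib_like(first, second, count=15):
    """Return `count` terms of x_n = x_(n-1) + x_(n-2), starting from the two given terms."""
    terms = [first, second]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms


def term(seq, n):
    """1-based access, matching the column index n of Table 4."""
    return seq[n - 1]


a = fib_like(6, 1)
b = fib_like(13, 9)
c = fib_like(30, 5)
g = fib_like(23, -3)

# g_n + a_(n+2) = c_n, checked for every n that stays inside the 15 tabulated terms
identity_c = all(term(g, n) + term(a, n + 2) == term(c, n) for n in range(1, 14))
# g_n + b_(n-3) = 2 a_(n+1), checked for n = 4, ..., 14
identity_mirror = all(term(g, n) + term(b, n - 3) == 2 * term(a, n + 1) for n in range(4, 15))

n = 7
leading, non_leading = term(c, n), 2 * term(b, n - 1)   # 190, 168
direct = 2 * 88                                          # 176
complement = term(a, n + 2) + 3 + 80                     # 182
mirror_col1 = 2 * term(a, n + 1) + 88                    # 210
mirror_col2 = term(a, n + 2) + 80 - term(b, n - 3)       # 148

print(identity_c, identity_mirror)
print(leading + non_leading, direct + complement, mirror_col1 + mirror_col2)
```

Because all four sequences obey the same recurrence, checking each identity for two consecutive values of n would already be sufficient; the loop over the tabulated range is kept only for readability.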

6.3.2. Charged Amino Acid Case, “Activation Key” On and Off

Now, we consider the case where four amino acids are in their (physiological) charged state, which is the main subject of this paper.

Hydrogen Atom Content

In the case of “activation key” on, there are 174 hydrogen atoms in the direct boxes and 188 hydrogen atoms in the complement boxes (from Table 3 and Table 8). Here, we recall Equation (25) of Section 4.1.1:
$$186 + 2 \times 86 + 4 = 364 - 2 = 362$$
By using again the identity in Equation (16) for $n = 7$, $84 = 86 - 2$, once, and arranging, we obtain
$$(186 + 2) + (86 + 84 + 4) = 188 + 174 = 362$$
which is the correct result. In the case of “activation key” off, there are 178 hydrogen atoms in the direct boxes and 188 hydrogen atoms in the complement boxes. Here, we start from Equation (12) of Section 3.2 and write it as
$$6 \times 8 + 6 \times (22 + 31) = 6 \times 61 = 366$$
where $53 = 22 + 31$ from the recurrence relation of the sequence $b_n$. Next, we use the same identity in Equation (38) of Section 5.1.2, again for $n = 4$ ($31 = 23 + 8$), to rewrite one copy of the number 31 above:
$$(6 \times 8 + 6 \times 22 + 8) + (5 \times 31 + 23) = 188 + 178 = 366$$
These are the correct hydrogen atom numbers mentioned above. Now, we look at the “mirror” symmetry. In the case of “activation key” on, there are 208 hydrogen atoms in column 1 and 154 hydrogen atoms in column 2 of Table 8, using the data of Table 4. Here, we start from Equation (60) above and put it in the following form:
$$(186 + 22) + (31 + 33 + 86 + 4) = 208 + 154 = 362$$
where we have used the recurrence relation $86 = 53 + 33$ of the sequence $a'_n$ and, next, replaced the number 53 of the latter sequence by the same number 53 of the sequence $b_n$, which is equal to $22 + 31$. (Recall that, from Equation (16), one has $a'_7 - b_5 = 53 - 53 = 2F_2 = 2 \times 0 = 0$.)
In the case of “activation key” off, there are 208 hydrogen atoms in column 1 and 158 hydrogen atoms in column 2 (see Table 8, data from Table 4). Consider again Equation (60) above:
$$6 \times 8 + 6 \times (22 + 31) = 366$$
By using, repeatedly, the recurrence relation of the sequence $b_n$ and also the relation $22 = 15 + 7$, from the identity $a_n + a_{n+2} = b_n$ for $n = 3$, we can put the equation above into the form:
$$(11 \times 13 + 15) + (17 \times 9 + 7 + 6 \times 8) = 158 + 208 = 366$$
which is the correct answer.
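The repeated use of the recurrence of $b_n$ in this last step can be made explicit. The lines below are a sketch of one possible route, assuming $22 = 13 + 9$ and $31 = 22 + 9 = 13 + 2 \times 9$ from Table 4, with a single copy of 22 rewritten as $15 + 7$:

$$\begin{aligned}
6 \times 22 + 6 \times 31 &= 5 \times 22 + (15 + 7) + 6 \times 31 \\
&= 5 \times (13 + 9) + (15 + 7) + 6 \times (13 + 2 \times 9) \\
&= 11 \times 13 + 17 \times 9 + 15 + 7
\end{aligned}$$

Adding $6 \times 8$ and regrouping then gives $(11 \times 13 + 15) + (17 \times 9 + 7 + 6 \times 8) = 158 + 208 = 366$.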

Atom Content

In the case of “activation key” on, there are 300 atoms in the direct boxes and 298 atoms in the complement boxes, with a total of 598 (see Table 8 and data from Table 4). In this case, we start from the relation
$$6a_n + 4 = 6 \times 99 + 4 = 598$$
(see Equation (30) and below, $n = 9$). It is now enough to write $4 = 3 + 1$, as a Lucas number, for example, and rewrite the above equation in the form
$$(3 \times 99 + 1) + (3 \times 99 + 3) = 298 + 300 = 598$$
which correctly describes the above atom content numbers. In the case of “activation key” on, there are 348 atoms in column 1 and 250 atoms in column 2 (see Table 8, data from Table 4). Here, we start from Equation (66) above and use the identity in Equation (11), $a_n + b_{n+1} = a_{n+4}$, with $n = 5$ ($99 = 15 + 84$). We have
$$6 \times (84 + 15) + 4 = 598$$
By introducing the identity in Equation (16) with $n = 7$, $84 = 86 - 2$, and arranging, we finally obtain the above correct atom numbers:
$$(4 \times 86 + 4) + (6 \times 15 + 2 \times 84 - 4 \times 2) = 348 + 250 = 598$$
In the case of “activation key” off, there are 304 atoms in the direct boxes and 298 atoms in the complement boxes, with a total of 602 atoms (see Table 8, data from Table 4). To describe this case, we start by writing Equation (34) of Section 4.2.2 as follows:
$$4 \times 61 + (137 + 221) = 7 \times 86 = 602$$
Now we, first, take one copy of the number 61 and write it as $53 + 8$, using the identity $a_n + b_{n+1} = a_{n+4}$ with $n = 4$ ($61 = 8 + 53$). Second, we write each of the other three copies of 61 using the recurrence relation $61 = 38 + 23$. Inserting these values in Equation (71), we obtain
$$(3 \times 38 + 53 + 137) + (8 + 3 \times 23 + 221) = 304 + 298 = 602$$
which is what we are looking for.
In the case of “activation key” off, there are 348 atoms in column 1 and 254 atoms in column 2 (see Table 8, data from Table 4). This case follows from the preceding one by noticing, as we did in the derivation of Equation (64) above, that the number $53 = a'_7$ is equal to $b_5 = 53$ (these sequences are linked; see Equation (16)). By using the recurrence relation $a'_7 = 53 = a'_6 + a'_5 = 33 + 20$ and arranging, we finally have the following answer:
$$(3 \times 38 + 20 + 137 + 8 + 3 \times 23) + (33 + 221) = 348 + 254 = 602$$
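The four atom-content splits of this subsection can be collected and checked in one place. The following Python sketch is illustrative only: `fib_like`, `term`, and the dictionary `splits` are hypothetical names, the starting terms are those of Table 4, and each entry simply mirrors the grouping of the corresponding equation above.

```python
def fib_like(first, second, count=15):
    """Return `count` terms of x_n = x_(n-1) + x_(n-2), starting from the two given terms."""
    terms = [first, second]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms


def term(seq, n):
    """1-based access, matching the column index n of Table 4."""
    return seq[n - 1]


a = fib_like(6, 1)         # a_n
a_prime = fib_like(1, 6)   # a'_n
b = fib_like(13, 9)        # b_n

# Identity a_n + b_(n+1) = a_(n+4), checked inside the tabulated range
identity_11 = all(term(a, n) + term(b, n + 1) == term(a, n + 4) for n in range(1, 12))

splits = {
    # key "on": (complement boxes, direct boxes) and (column 1, column 2)
    "on boxes": (3 * term(a, 9) + 1, 3 * term(a, 9) + 3),
    "on columns": (4 * term(a_prime, 8) + 4,
                   6 * term(a, 5) + 2 * term(b, 6) - 4 * 2),
    # key "off": (direct boxes, complement boxes) and (column 1, column 2)
    "off boxes": (3 * term(a, 7) + term(b, 5) + term(b, 7),
                  term(a, 4) + 3 * term(a, 6) + term(b, 8)),
    "off columns": (3 * term(a, 7) + term(a_prime, 5) + term(b, 7)
                    + term(a, 4) + 3 * term(a, 6),
                    term(a_prime, 6) + term(b, 8)),
}

for name, (left, right) in splits.items():
    print(name, left, right, left + right)
```

The two "on" rows sum to 598 and the two "off" rows to 602, as stated in the text.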

7. More on shCherbak’s Theory

In [1], we derived the relation
$$115 = 41 + 74 = 42 + 73$$
This describes proline’s singularity (see [3,4]). In this section, we go much further by presenting completely new results. First, consider, once again, the sequence $a_n$, more exactly $a_7 = 38$. We have, by squaring:
$$a_7^2 = 1444$$
It is not difficult to see, from Table 3, that this number corresponds to the number of nucleons (or integer molecular mass) in the side chains of the amino acids coded by 23 codons, where the sextets are counted twice, and proline has 42 nucleons in its side chain and only 73 nucleons in its backbone, contrary to the other 19 amino acids, which have 74 nucleons in their backbones (see Equation (74) above and Section 1.2). Second, from the identity $\sum_{k=1}^{n} a_k = a_{n+2} - 1$, already considered in the sections above, we can write Equation (75) as follows, using $n = 5$ twice:
$$a_7^2 = 38^2 = 38 \times (37 + 1) = 38 \times 37 + 37 + 1 = 1443 + 1$$
We recognize here the unit corresponding to the “singular” nucleon and the 1443 nucleons where proline, now, has 41 nucleons in its side chain and 74 nucleons in its backbone, as do the 19 other amino acids. Third, we can derive the very molecular mass of proline from the above numbers of nucleons, 1443 and 1444. To see this, we use another tool from number theory, modular arithmetic, which has many applications in mathematics (group theory, knot theory, ring theory) and computer science (computer algebra, coding theory, cryptography, and so on); see, for example, [11]. Several kinds of moduli are used in applications, for example, modulo 11 in international standard book numbers (ISBNs) or mod 37 and mod 97 arithmetic in error detection in bank account numbers. We shall, here, take as moduli the integers 99 and 999. (This is equivalent to summing the “digits” in base-100 and base-1000, respectively.) We have
$$1443 \bmod 99 + 1444 \bmod 99 = 57 + 58 = 115$$
The reader could use, if desired, quick online calculators for the modulo function, for example, [12]. Using the trick of the digit summation mentioned above ($57 = 14 + 43$ and $58 = 14 + 44$), we can arrange the above relation as $115 = 43 + 72$. In what follows, we shall use two functions from elementary number theory: Euler’s φ-function of an integer $n$ (also known as Euler’s totient function), which counts the number of positive integers less than or equal to $n$ that are relatively prime to $n$ [13], and the σ-function, which gives the sum of the divisors of an integer $n$ [14]. In the case where the integer is a prime number $p$, these functions simplify greatly and one has simply $\varphi(p) = p - 1$ and $\sigma(p) = p + 1$. Noting that 43 above is the only odd number out of four (43, 14, 14, 44) and, furthermore, a prime “digit” (remember we are in base-100), we obtain by calling its φ-function ($\varphi(43) = 43 - 1 = 42$): $115 = 42 + (72 + 1) = 42 + 73$. We also have $41 + (73 + 1)$ if we use $\sigma(41) = 41 + 1 = 42$. These are the same relations as in Equation (74) above. The numbers 1443 and 1444 are useful, as explained above, but there is also a third number which not only plays a role together with the other two but also has a meaningful interpretation. It is given by the following relation:
$$1444 + (1444 \bmod 1443) = 1444 + 1 = 1445$$
This number corresponds to the number of nucleons in the side chains of the amino acids encoded by 23 codons (the sextets counted twice), with proline’s side chain having 42 nucleons and four amino acids in their charged state (see Section 1.2, Table 3 and above it):
$$(145 + 1) + (188 + 1) \times 2 + (660 + 1 - 1 - 1) + 57 + 130 + 75 = 1445$$
In the first parentheses, 1 corresponds to the supplementary nucleon in proline’s side chain. In the second parentheses, 1 corresponds to the charged arginine. In the third parentheses, the units correspond respectively to lysine (charge +1), aspartic acid (charge −1), and glutamic acid (charge −1). We have therefore three meaningful numbers: 1443, 1444, and 1445. From these, we consider the following expression:
$$1443 \bmod 999 + 1444 \bmod 999 + 1445 \bmod 999 = 444 + 445 + 446 = 1335$$
and take its $a_0$-function, the sum of its prime factors ($1335 = 3 \times 5 \times 89$); see below about this function.
$$a_0(1335) = 3 + 5 + 89 = 97$$
This number is equal to the number of nucleons (or molecular mass) of the residue of proline (see [5], Table 1). When two amino acids (or more) combine to form a peptide, a water molecule (two hydrogen atoms and one oxygen atom) is released and what remains of each amino acid is called a residue. Here, we have $115 - 97 = 18$ $(= 115 \bmod 97)$, which is the molecular mass of the water molecule. Note that we also have, using two of the above numbers, 444 and 445:
$$444 \bmod 99 + 445 \bmod 99 = 48 + 49 = 97$$
Both relations give the same result, 97. From Equations (81) and (82), we have the two-fold result
$$444 \bmod 99 + 445 \bmod 99 + 115 \bmod 97 = 97 + 18 = 115$$
$$a_0(1335) + 115 \bmod 97 = 97 + 18 = 115$$
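The modular reductions above are easy to reproduce without an online calculator. The Python sketch below is only an illustration: `prime_factors` and `a0` are hypothetical helper names, and `a0` implements the sum of prime factors with multiplicity as described in the text.

```python
def prime_factors(n):
    """Prime factors of n with multiplicity, found by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def a0(n):
    """Sum of the prime factors of n, multiplicities included."""
    return sum(prime_factors(n))


totals = (1443, 1444, 1445)

mod99 = [t % 99 for t in totals[:2]]    # [57, 58]
mod999 = [t % 999 for t in totals]      # [444, 445, 446]

residue_from_a0 = a0(sum(mod999))       # a0(1335) = 3 + 5 + 89 = 97
residue_from_mod = 444 % 99 + 445 % 99  # 48 + 49 = 97
water = 115 % 97                        # 18

print(sum(mod99), residue_from_a0, residue_from_mod, water)
```

Both routes to 97 add up to 115 once the 18 nucleons of the water molecule are included.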
Finally, it is also possible to derive the detailed atomic composition of the (whole) molecule of proline, $\mathrm{C_5H_9O_2N}$. Starting from Equation (81) and then adding the quantity $115 \bmod 97 = 18 = 2 \times 9$,
$$a_0(1335) + 115 \bmod 97 = 3 + 5 + 89 + 18 = 115$$
Now, 89, as a Fibonacci number, can be decomposed successively as $55 + 34$ and $55 + 21 + 13 = 55 + 13 + 13 + 8 = 55 + 13 + 5 + 8 + 8$. By inserting this decomposition in the above equation and arranging, we have
$$(5 + 55) + (5 + 9) + (3 + 13 + 8 + 8) + 9 = 60 + 14 + 32 + 9 = 115$$
This is the correct result. The number 60 has the prime factorization $2^2 \times 3 \times 5 = (2 \times 6) \times 5$ and gives five carbon atoms (carbon nucleus: six protons, six neutrons). The number 14 has the prime factorization $2 \times 7$ and corresponds to one nitrogen atom (nitrogen nucleus: seven protons, seven neutrons). The number 32 has the prime factorization $2^5 = 2 \times 2 \times 2^3 = 2 \times (2 \times 8)$ and corresponds to two oxygen atoms (oxygen nucleus: eight protons, eight neutrons). The last number, 9, corresponds to nine hydrogen atoms.
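The element-by-element reading of $60 + 14 + 32 + 9$ can be cross-checked against the formula of proline. This is a small sketch, assuming integer nucleon numbers for the most common isotopes (C 12, H 1, N 14, O 16); the names `NUCLEONS` and `nucleon_count` are hypothetical.

```python
# Integer nucleon numbers of the most common isotopes (assumption stated in the lead-in).
NUCLEONS = {"C": 12, "H": 1, "N": 14, "O": 16}

PROLINE = {"C": 5, "H": 9, "O": 2, "N": 1}   # C5H9O2N, the whole molecule
WATER = {"H": 2, "O": 1}


def nucleon_count(formula):
    """Total nucleon number of a molecule given as {element: atom count}."""
    return sum(NUCLEONS[element] * count for element, count in formula.items())


# Contribution of each element, in the order used in the text: C, N, O, H
per_element = [NUCLEONS[e] * PROLINE[e] for e in ("C", "N", "O", "H")]

molecule = nucleon_count(PROLINE)
residue = molecule - nucleon_count(WATER)

print(per_element, molecule, residue)   # [60, 14, 32, 9] 115 97
```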
In order to fully understand the reasoning presented below, it is important for the reader to keep in mind that, in Equations (77) and (80), 1443 represents the number of nucleons in the side chains of the amino acids coded by 23 codons, with the sextets counted twice and proline having 41 nucleons in its side chain, while 1444 represents the same number with proline now having 42 nucleons in its side chain. In fact, it appears that there is compelling evidence that the calculations performed here are “locked” technically. Below, we show why but, before doing that, let us briefly recall a few elements of our helpful arithmetic function $A_0$ (see Appendix B in [1]). From the Fundamental Theorem of Arithmetic, an integer $n$ can be represented, uniquely, as a product of prime numbers irrespective of their order: $n = p_1^{n_1} \times p_2^{n_2} \times \cdots \times p_k^{n_k}$. The function $A_0$ is defined by the formula $A_0(n) = a_0(n) + SPI(n) + \Omega(n)$, where $a_0(n)$ is the sum of the prime factors (including the multiplicities), $p_1 \times n_1 + p_2 \times n_2 + \cdots + p_k \times n_k$; $SPI(n)$ is the sum of the prime indices of the prime factors (including the multiplicities), $PI(p_1) \times n_1 + PI(p_2) \times n_2 + \cdots + PI(p_k) \times n_k$; and $\Omega(n)$, the so-called Big Omega function, is the number of prime factors, $n_1 + n_2 + \cdots + n_k$. The portion $a_0(n)$ of this function was already involved in the derivation of Equation (81) above.
Now, let us look at the moduli 99 and 999, which, together with the numbers 1443 and 1444, were critical in the derivation of Equations (77), (80), and (82). Their prime factorizations are $99 = 3^2 \times 11$ and $999 = 3^3 \times 37$. We have $A_0(99) = 29$ and $A_0(999) = 68$ and, therefore, $A_0(99) + A_0(999) = 29 + 68 = 97$. This is, again, nothing but the integer molecular mass of proline’s residue; see Equations (81) and (82). Also, by isolating the two terms $PI(37) = 12$ and $\Omega(37) = 1$ in $A_0(999)$ and including them in $A_0(99)$, we obtain $(29 + 12 + 1) + (3 \times 3 + 3 \times 2 + 3 \times 1 + 37) = 42 + 55$. This is a more accurate description of proline’s residue (see [5], Table 1), which can also be seen from Equation (81) above, remembering that 89 is a Fibonacci number: $3 + 5 + 89 = 3 + 5 + 34 + 55 = 42 + 55$. By pushing the precision to the extreme, we can arrange the side chain part as follows: $42 = 29 + 12 + 1 = 6 + 6 + 11 + 1 + 12 + 5 + 1 = 3 \times 12 + (5 + 1)$, where we have made explicit the portions of $A_0(99)$. We have three carbon atoms (atomic mass 12) and six hydrogen atoms; see the side chain in Figure 1 below. The last term is interpreted as six hydrogen atoms in the side chain, $(5 + 1)$, with one hydrogen atom susceptible to being “transferred” from the side chain to the backbone (shCherbak’s “borrowing”, see above and Table 3). Of course, one has to add 18, the water molecule, from Equation (83), to obtain the whole molecule of proline. Below, in Figure 1, we show it with the side chain boxed.
The unique charm and covert attraction of proline’s structure are concealed inside the integer molecular masses, just waiting to be gently revealed through the use of modular arithmetic.
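The function $A_0$ recalled in this section can be sketched in Python as follows. The function names (`prime_factors`, `prime_index`, `A0_parts`, `A0`) are hypothetical, and the convention $PI(2) = 1$, $PI(3) = 2$, and so on, is an assumption that is consistent with the value $PI(37) = 12$ quoted in the text.

```python
def prime_factors(n):
    """Prime factors of n with multiplicity, found by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def prime_index(p):
    """Position of the prime p in the list 2, 3, 5, 7, ... (so prime_index(2) == 1)."""
    return sum(1 for k in range(2, p + 1) if len(prime_factors(k)) == 1)


def A0_parts(n):
    """Return the three portions (a_0, SPI, Omega) of A_0(n)."""
    factors = prime_factors(n)
    return (sum(factors),
            sum(prime_index(p) for p in factors),
            len(factors))


def A0(n):
    """A_0(n) = a_0(n) + SPI(n) + Omega(n)."""
    return sum(A0_parts(n))


print(A0_parts(99), A0(99))     # (17, 9, 3) 29
print(A0_parts(999), A0(999))   # (46, 18, 4) 68

# Moving PI(37) and the unit count of the factor 37 from A0(999) to A0(99)
side_chain = A0(99) + prime_index(37) + 1
rest = A0(999) - prime_index(37) - 1
print(side_chain, rest, side_chain + rest)   # 42 55 97
```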

8. Multiplet Structures

This section deals with another application of our Fibonacci-like sequences, more precisely, the sequences $a_n$ and $a'_n$. In [15], we derived the exact multiplet structure of the genetic code, starting from the total number of codons, 64, expressed from the beginning as $8 \times 8$ and using Fibonacci/Lucas decompositions. We subsequently used either a property of “superperfect” numbers or the relation between Fibonacci and Lucas numbers to write one factor 8 as $7 + 1$ and next 7 as $3 + 4$ to derive the above-mentioned multiplet structure. Here, we show that all the ingredients of this derivation are, in fact, already embedded in our Fibonacci-like sequences. Taking $a_4 = 8$ (see Table 4), first, there is the recurrence relation $a_3 + a_2 = 7 + 1 = a_4 = 8$. This is the decomposition of the number 8 mentioned above, obtained here without recourse to “superperfect” numbers, for example. Next, from the Lucas sequence in Equation (4), $L_n = F_n + F_{n+2}$, which is derived from the Fibonacci sequence $F_n$ in Equation (3), itself derived from the sequences $a_n$ and $a'_n$ in Equation (2), we have $7 = 4 + 3$. This is all we need to write
$$a_4 \times (a_3 + a_2) = 8 \times (4 + 3 + 1)$$
which leads, after writing the Fibonacci number 8 as $5 + 3$, to the following multiplet structure of the (standard) genetic code, which can be expressed in two equivalent forms, Equations (87) and (88):
$$(5 \times 4 + 3 \times 4) + (9 \times 2 + 3 \times 2 + 3 + 2 + 3) = 64$$
$$5 \times 4 + 3 \times (4 + 2) + 9 \times 2 + 3 + 2 + 3 = 64$$
The form in Equation (87) describes Rumer’s division (see Section 4): five quartets (four codons each) and three quartet parts of the three sextets (four codons each) in the first parentheses (set $M_1$), and nine doublets (two codons each), three doublet parts of the three sextets (two codons each), one triplet (three codons), two singlets (one codon each), and three stops (three codons) in the second parentheses (set $M_2$). The form in Equation (88) describes the usual multiplet structure: five quartets, three sextets (six codons each, $6 = 4 + 2$), nine doublets, one triplet, two singlets, and three stops. The vertebrate mitochondrial genetic code can also be easily derived from Equation (88); see [1]. In fact, in unpublished notes, we have also derived from Equation (86), with a little work, several other multiplet structures of the (non-standard) genetic codes. Let us give, here, only one example: the Alternative Yeast Nuclear Code (#12 in the database [16]). In this code, shown in Table 9 below, the only change concerns the reassignment of the codon CUG of leucine, which now codes for serine. We have therefore five quartets (V, A, T, P, G), one sextet (R), one quintet (L: UUR, CUY, CUA), one septet (S: UCN, AGY, CUG), nine doublets (F, Y, C, H, Q, D, E, N, K), one triplet (I), two singlets (M, W), and three stops. To describe this code, let us start from Equation (88) and rewrite it in the form
$$5 \times 4 + 1 \times (4 + 2) + (8 + 4) + 9 \times 2 + 3 + 2 + 3 = 64$$
by selecting a factor $2 \times (4 + 2)$ and developing it as $8 + 4$. Now, we write the Fibonacci number 8 as $8 = 5 + 3 = 3 + 2 + (2 + 1)$ and insert it in Equation (88). We have, writing $3 = 2 + 1$ again,
$$5 \times 4 + 1 \times (4 + 2) + (1 + 2 + 2) + (4 + 2 + 1) + 9 \times 2 + 3 + 2 + 3 = 64$$
This relation describes this code. Arginine, the term $1 \times (4 + 2)$, is now the only sextet left. The term $(1 + 2 + 2)$ is suitable for the quintet leucine, coded now by five codons: CUA (one codon), CUY (two codons), and UUR (two codons). The term $(4 + 2 + 1)$ describes the septet serine, coded now by seven codons: UCN (four codons), AGY (two codons), and CUG (one codon). The remaining terms are the usual ones (see above). The other non-standard genetic codes can be handled along the same lines with, of course, some additional work.
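The two multiplet structures can also be counted directly from the codon assignments, independently of the arithmetic above. The Python sketch below is an illustration under stated assumptions: the 64-letter string is the standard code written in the usual U, C, A, G ordering of the three codon positions (`*` marks a stop), and `standard_code` and `multiplet_structure` are hypothetical helper names. The alternative yeast code is obtained, as in the text, by reassigning CUG to serine.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, in the order UUU, UUC, UUA, UUG, UCU, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def standard_code():
    """Return {codon: one-letter amino acid or '*'} for the standard genetic code."""
    codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
    return dict(zip(codons, AMINO))


def multiplet_structure(code):
    """Return ({multiplet size: number of amino acids}, number of stop codons)."""
    codons_per_amino_acid = Counter(code.values())
    stops = codons_per_amino_acid.pop("*", 0)
    return dict(Counter(codons_per_amino_acid.values())), stops


standard = standard_code()

yeast_alternative = dict(standard)
yeast_alternative["CUG"] = "S"   # the single reassignment described in the text

print(multiplet_structure(standard))
print(multiplet_structure(yeast_alternative))
```

For the standard code the count gives two singlets, nine doublets, one triplet, five quartets, and three sextets, with three stops; after the reassignment, two of the sextets become a quintet and a septet.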

9. Conclusions

We have once again studied the genetic code symmetries by taking an unexplored route. As previously mentioned, we recently used a small set of Fibonacci-like sequences that we designed to describe the symmetries of the genetic code [1]. However, this time, we thought of the amino acids as if they were submerged in a physiological environment (neutral pH), where four of them pick up a charge, either −1 (for aspartic acid and glutamic acid) or +1 (for arginine and lysine). The option examined in [4,5] is the same as this one. Additionally, and this is just as novel, we have examined two potential viewpoints for the unique amino acid proline, whose side chain is connected to its backbone twice: shCherbak’s view and the Downes–Richardson view (see Section 1.2). With these two newly considered components, we have outlined, in Section 4.1 and Section 4.2, the patterns for the hydrogen atom content and the atom content for Rumer’s symmetry under both viewpoints indicated above (referred to as “on” and “off” in the text). The same work has been carried out for the third-base symmetry in Section 5.1 and Section 5.2 and for the “ideal” symmetry as well as the more complex “supersymmetry” genetic code table in Section 6.1, Section 6.2 and Section 6.3. In Section 7, we have uncovered the remarkably unique chemical structure of proline along with its corresponding “activation” key, all with a basic application of modular arithmetic. Finally, we used our Fibonacci-like sequence $a_n$ once more in Section 8 to derive, in a new way, not only the exact number of amino acids, 20, and the multiplet structure of the standard genetic code (five quartets, three sextets, nine doublets, one triplet, two singlets, and three stops) but also, through an example, the exact multiplet structure of a non-standard variant of the genetic code, the Alternative Yeast Nuclear Code.
For the other non-standard genetic codes, the strategy is analogous to the one adopted here, i.e., starting from Equation (87), or one of its variants obtained while treating a given non-standard version of the genetic code, at a given stage, and applying Fibonacci/Lucas decompositions and/or regrouping of the numeric factors. All the known non-standard versions of the genetic code, treated this way, will be the subject of a future publication, as another (new) practical application.

Funding

This research received no external funding.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Négadi, T. Revealing the genetic code symmetries through computations involving Fibonacci-like sequences and their properties. Computation 2023, 11, 154.
  2. Nirenberg, M.; Leder, P.; Bernfield, M.; Brimacombe, R.; Trupin, J.; Rottman, F.; O’Neal, C. RNA Codewords and Protein Synthesis, VII. On the General Nature of the RNA Code. Proc. Natl. Acad. Sci. USA 1965, 53, 1161–1168.
  3. shCherbak, V. The Arithmetical origin of the genetic code. In The Codes of Life: The Rules of Macroevolution; Barbieri, M., Ed.; Springer: New York, NY, USA, 2008; pp. 153–185.
  4. shCherbak, V.; Makukov, M. The “wow! Signal” of the terrestrial genetic code. Icarus 2013, 224, 228–242.
  5. Downes, A.M.; Richardson, B.J. Relationships between genomic base content and distribution of mass in coded proteins. J. Mol. Evol. 2002, 55, 476–490.
  6. Rumer, Y. About systematization of the genetic code. Dok. Akad. Nauk SSSR 1966, 167, 1393–1394.
  7. Findley, G.I.; Findley, A.M.; McGlynn, S.P. Symmetry characteristics of the genetic code. Proc. Natl. Acad. Sci. USA 1982, 79, 7061–7065.
  8. Shu, J.J. A new integrated symmetrical table for genetic codes. Biosystems 2017, 151, 21–26.
  9. Rosandić, M.; Paar, V. Codons sextets with leading role of serine create “ideal” symmetry classification scheme of the genetic code. Gene 2014, 543, 45–52.
  10. Rosandić, M.; Paar, V. Standard Genetic Code vs. Supersymmetry Genetic code—Alphabetical table vs. physicochemical table. BioSystems 2022, 218, 104695.
  11. Berggren, J.L. Modular Arithmetic. Encyclopedia Britannica. 2023. Available online: https://www.britannica.com/science/modular-arithmetic (accessed on 23 December 2023).
  12. Available online: https://www.calculatorsoup.com/calculators/math/modulo-calculator.php (accessed on 23 December 2023).
  13. Available online: https://t5k.org/glossary/page.php?sort=EulersTheorem (accessed on 23 December 2023).
  14. Available online: https://www.dcode.fr/divisors-list-number (accessed on 23 December 2023).
  15. Négadi, T. Is the genetic code better described by elementary number theory? Acad. Lett. 2021, 2, 1004.
  16. Available online: https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi?chapter=tgencodes#SG2 (accessed on 23 December 2023).
Figure 1. Proline (the molecule). The side chain is boxed in red color and the arrow is the possible transfer of one hydrogen atom (or one nucleon) from the side chain to the backbone (see text).
Table 1. The five multiplets of the standard genetic code. The first column lists the five multiplets and their number. The second column gives, in parentheses, the corresponding amino acids and their three-letter and one-letter codes.
| Multiplets | Amino Acids |
|---|---|
| 3 sextets | serine (Ser, S), arginine (Arg, R), leucine (Leu, L) |
| 5 quartets | proline (Pro, P), alanine (Ala, A), threonine (Thr, T), valine (Val, V), glycine (Gly, G) |
| 1 triplet | isoleucine (Ile, I) |
| 9 doublets | phenylalanine (Phe, F), tyrosine (Tyr, Y), cysteine (Cys, C), histidine (His, H), glutamine (Gln, Q), glutamic acid (Glu, E), aspartic acid (Asp, D), asparagine (Asn, N), lysine (Lys, K) |
| 2 singlets | methionine (Met, M), tryptophan (Trp, W) |
Table 2. The genetic code table [1]. The 64 codons and their encoded amino acids (in the three-letter code) are shown. Several colors and hues are used to differentiate between the encoded amino acids. For example, leucine (Leu) and its six codons are in light blue. For the five amino acids in similar red colors, the RGB codes are: Arg (255, 140, 102), Ala (255, 33, 33), Val (255, 102, 102), Tyr (255, 102, 140), Asp (255, 117, 102).
| UUU Phe | UUC Phe | UCU Ser | UCC Ser | CUU Leu | CUC Leu | CCU Pro | CCC Pro |
| UUA Leu | UUG Leu | UCA Ser | UCG Ser | CUA Leu | CUG Leu | CCA Pro | CCG Pro |
| UAU Tyr | UAC Tyr | UGU Cys | UGC Cys | CAU His | CAC His | CGU Arg | CGC Arg |
| UAA Stop | UAG Stop | UGA Stop | UGG Trp | CAA Gln | CAG Gln | CGA Arg | CGG Arg |
| AUU Ile | AUC Ile | ACU Thr | ACC Thr | GUU Val | GUC Val | GCU Ala | GCC Ala |
| AUA Ile | AUG Met | ACA Thr | ACG Thr | GUA Val | GUG Val | GCA Ala | GCG Ala |
| AAU Asn | AAC Asn | AGU Ser | AGC Ser | GAU Asp | GAC Asp | GGU Gly | GGC Gly |
| AAA Lys | AAG Lys | AGA Arg | AGG Arg | GAA Glu | GAG Glu | GGA Gly | GGG Gly |
Table 3. The elemental composition of the side chains of the 20 amino acids. The first column gives the multiplicity M, or the number of codons encoding the amino acid. The following columns give respectively the number of hydrogen atoms (H); carbon atoms (C); nitrogen, oxygen, and sulfur atoms (N/O/S); the total number of atoms (column 6); and the integer molecular mass or nucleon number (column 7). The positive charges are indicated in blue and the negative charges are indicated in red (see text for explanations about the last five rows).
| M | Amino Acid | # H | # C | # N/O/S | # Atoms | # Nucleons |
|---|---|---|---|---|---|---|
| 4 | Proline (Pro) on/off | 5 (+1) | 3 | 0 | 8 (+1) | 41 (+1) |
|  | Alanine (Ala) | 3 | 1 | 0 | 4 | 15 |
|  | Threonine (Thr) | 5 | 2 | 0/1/0 | 8 | 45 |
|  | Valine (Val) | 7 | 3 | 0 | 10 | 43 |
|  | Glycine (Gly) | 1 | 0 | 0 | 1 | 1 |
| 6 | Serine (Ser) | 3 | 1 | 0/1/0 | 5 | 31 |
|  | Leucine (Leu) | 9 | 4 | 0 | 13 | 57 |
|  | Arginine (Arg) | 10 (+1) | 4 | 3/0/0 | 17 (+1) | 100 (+1) |
| 2 | Phenylalanine (Phe) | 7 | 7 | 0 | 14 | 91 |
|  | Tyrosine (Tyr) | 7 | 7 | 0/1/0 | 15 | 107 |
|  | Cysteine (Cys) | 3 | 1 | 0/0/1 | 5 | 47 |
|  | Histidine (His) | 5 | 4 | 2/0/0 | 11 | 81 |
|  | Glutamine (Gln) | 6 | 3 | 1/1/0 | 11 | 72 |
|  | Asparagine (Asn) | 4 | 2 | 1/1/0 | 8 | 58 |
|  | Lysine (Lys) | 10 (+1) | 4 | 1/0/0 | 15 (+1) | 72 (+1) |
|  | Aspartic Acid (Asp) | 3 (−1) | 2 | 0/2/0 | 7 (−1) | 59 (−1) |
|  | Glutamic Acid (Glu) | 5 (−1) | 3 | 0/2/0 | 10 (−1) | 73 (−1) |
| 3 | Isoleucine (Ile) | 9 | 4 | 0 | 13 | 57 |
| 1 | Methionine (Met) | 7 | 3 | 0/0/1 | 11 | 75 |
|  | Tryptophan (Trp) | 8 | 9 | 1/0/0 | 18 | 130 |
|  | Total (20) on/off | 117/118 | 67 | 20 | 204/205 | 1255/1256 |
|  | Total (23) on/off | 140/141 | 76 | 24 | 240/241 | 1444/1445 |
|  | Total (38) on/off | 222/225 | 104 | 32 | 358/361 | 1964/1967 |
|  | Total (61) on/off | 362/366 | 180 | 56 | 598/602 | 3408/3412 |
|  | $M_1$/$M_2$ on | 176/186 |  |  | 268/330 | 1336/2072 |
|  | $M_1$/$M_2$ off | 180/186 |  |  | 272/330 | 1340/2072 |
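The "Total" rows of Table 3 can be recomputed from the individual side-chain entries. The Python sketch below is an illustration only: the tuple layout, the name `SIDE_CHAINS`, and the function `totals` are hypothetical, the per-amino-acid values are the neutral ones transcribed from Table 3, and the charge column is applied as a shift of one hydrogen atom (one atom, one nucleon) as indicated in parentheses in the table.

```python
# (amino acid, codons M, H atoms, atoms, nucleons, charge) for the neutral side chains of Table 3
SIDE_CHAINS = [
    ("Pro", 4, 5, 8, 41, 0), ("Ala", 4, 3, 4, 15, 0), ("Thr", 4, 5, 8, 45, 0),
    ("Val", 4, 7, 10, 43, 0), ("Gly", 4, 1, 1, 1, 0),
    ("Ser", 6, 3, 5, 31, 0), ("Leu", 6, 9, 13, 57, 0), ("Arg", 6, 10, 17, 100, +1),
    ("Phe", 2, 7, 14, 91, 0), ("Tyr", 2, 7, 15, 107, 0), ("Cys", 2, 3, 5, 47, 0),
    ("His", 2, 5, 11, 81, 0), ("Gln", 2, 6, 11, 72, 0), ("Asn", 2, 4, 8, 58, 0),
    ("Lys", 2, 10, 15, 72, +1), ("Asp", 2, 3, 7, 59, -1), ("Glu", 2, 5, 10, 73, -1),
    ("Ile", 3, 9, 13, 57, 0), ("Met", 1, 7, 11, 75, 0), ("Trp", 1, 8, 18, 130, 0),
]


def totals(weight, proline_extra=0, charged=True):
    """Sum (H, atoms, nucleons) over the 20 side chains.

    weight(M) gives how many times an amino acid with M codons is counted;
    proline_extra=1 is the "off" view (one more hydrogen in proline's side chain).
    """
    h = atoms = nucleons = 0
    for name, m, n_h, n_atoms, n_nucleons, charge in SIDE_CHAINS:
        shift = (charge if charged else 0) + (proline_extra if name == "Pro" else 0)
        w = weight(m)
        h += w * (n_h + shift)
        atoms += w * (n_atoms + shift)
        nucleons += w * (n_nucleons + shift)
    return h, atoms, nucleons


once = lambda m: 1                               # Total (20)
sextets_twice = lambda m: 2 if m == 6 else 1     # Total (23)
per_codon = lambda m: m                          # Total (61)

for label, weight in (("20", once), ("23", sextets_twice), ("61", per_codon)):
    print(label, totals(weight), totals(weight, proline_extra=1))

# Uncharged side chains, sextets counted twice: 1443 (proline 41) and 1444 (proline 42)
print(totals(sextets_twice, charged=False)[2],
      totals(sextets_twice, proline_extra=1, charged=False)[2])
```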
Table 4. The first terms of the Fibonacci-like sequences $a_n$, $a'_n$, $b_n$, $c_n$, and $g_n$ [1]. The first column gives the initial conditions, p and q, and also the name of each sequence (see text).
| n | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| p = 1, q = 6; $a_n$ | 6 | 1 | 7 | 8 | 15 | 23 | 38 | 61 | 99 | 160 | 259 | 419 | 678 | 1097 | 1775 |
| p = 6, q = 1; $a'_n$ | 1 | 6 | 7 | 13 | 20 | 33 | 53 | 86 | 139 | 225 | 364 | 589 | 953 | 1542 | 2495 |
| p = 9, q = 13; $b_n$ | 13 | 9 | 22 | 31 | 53 | 84 | 137 | 221 | 358 | 579 | 937 | 1516 | 2453 | 3969 | 6422 |
| p = 5, q = 30; $c_n$ | 30 | 5 | 35 | 40 | 75 | 115 | 190 | 305 | 495 | 800 | 1295 | 2095 | 3390 | 5485 | 8875 |
| p = −3, q = 23; $g_n$ | 23 | −3 | 20 | 17 | 37 | 54 | 91 | 145 | 236 | 381 | 617 | 998 | 1615 | 2613 | 4228 |
Table 5. Rumer’s division of the genetic code table into two sets M 1 and M 2 . The symbols and the colors are the same as those in Table 2. The set M 1 comprising eight quartets of codons is shown in grey background. The set M 2 constituting the remaining part of the table comprises the nine doublets, the three doublet parts of the three sextets, the triplet, the two singlets, and the three stops.
Table 5. Rumer’s division of the genetic code table into two sets M 1 and M 2 . The symbols and the colors are the same as those in Table 2. The set M 1 comprising eight quartets of codons is shown in grey background. The set M 2 constituting the remaining part of the table comprises the nine doublets, the three doublet parts of the three sextets, the triplet, the two singlets, and the three stops.
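| UUU−Phe | UUC−Phe | UCU−Ser | UCC−Ser | CUU Leu | CUC Leu | CCU Pro | CCC Pro |
| UUA Leu | UUG Leu | UCA−Ser | UCG−Ser | CUA Leu | CUG Leu | CCA Pro | CCG Pro |
| UAU Tyr | UAC Tyr | UGU Cys | UGC Cys | CAU−His | CAC−His | CGU Arg | CGC Arg |
| UAA Stop | UAG Stop | UGA Stop | UGG−Trp | CAA−Gln | CAG−Gln | CGA Arg | CGG Arg |
| AUU Ile | AUC Ile | ACU−Thr | ACC−Thr | GUU Val | GUC Val | GCU Ala | GCC Ala |
| AUA Ile | AUG Met | ACA−Thr | ACG−Thr | GUA Val | GUG Val | GCA Ala | GCG Ala |
| AAU Asn | AAC Asn | AGU−Ser | AGC−Ser | GAU Asp | GAC Asp | GGU−Gly | GGC−Gly |
| AAA Lys | AAG Lys | AGA Arg | AGG Arg | GAA Glu | GAG Glu | GGA−Gly | GGG−Gly |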
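Rumer's division described in the caption can be reproduced mechanically: a box of four codons sharing the first two bases belongs to $M_1$ when all four encode the same amino acid, and to $M_2$ otherwise. The sketch below assumes the usual compact 64-letter encoding of the standard code (bases ordered U, C, A, G at each position, `*` for stop). The names `CODE` and `rumer_sets` are illustrative.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated with the third base varying fastest.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rumer_sets(code):
    """Split the 64 codons into M1 (homogeneous boxes) and M2 (the rest)."""
    m1, m2 = [], []
    for first, second in product(BASES, repeat=2):
        box = [first + second + third for third in BASES]
        target = m1 if len({code[c] for c in box}) == 1 else m2
        target.extend(box)
    return m1, m2


m1, m2 = rumer_sets(CODE)
print(len(m1), len(m2))
print(sorted({codon[:2] for codon in m1}))
```

With the standard code this gives 32 codons in each set, with $M_1$ made of the eight boxes UC, CU, CC, CG, AC, GU, GC and GG, and the three stops falling in $M_2$.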
Table 6. The 3rd-base classification of the 64 codons [7]. The 16 codons each ending in U, C, A, or G are gathered in columns 1, 3, 5, and 7, respectively, and their encoded amino acids are indicated in columns 2, 4, 6, and 8. The three stop codons are indicated in columns 5 and 7. For the symbols in the first row, see text. The numbers in the last two rows are explained in this section.
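| $C_U$ | $f C_U$ | $C_C$ | $f C_C$ | $C_A$ | $f C_A$ | $C_G$ | $f C_G$ |
|---|---|---|---|---|---|---|---|
| UCU | Ser | UCC | Ser | UCA | Ser | UCG | Ser |
| AGU | Ser | AGC | Ser | AGA | Arg | AGG | Arg |
| CGU | Arg | CGC | Arg | CGA | Arg | CGG | Arg |
| CUU | Leu | CUC | Leu | CUA | Leu | CUG | Leu |
| GCU | Ala | GCC | Ala | UUA | Leu | UUG | Leu |
| GUU | Val | GUC | Val | GCA | Ala | GCG | Ala |
| CCU | Pro | CCC | Pro | GUA | Val | GUG | Val |
| GGU | Gly | GGC | Gly | CCA | Pro | CCG | Pro |
| ACU | Thr | ACC | Thr | GGA | Gly | GGG | Gly |
| UUU | Phe | UUC | Phe | ACA | Thr | ACG | Thr |
| UAU | Tyr | UAC | Tyr | CAA | Gln | CAG | Gln |
| UGU | Cys | UGC | Cys | AAA | Lys | AAG | Lys |
| CAU | His | CAC | His | GAA | Glu | GAG | Glu |
| GAU | Asp | GAC | Asp | UAA | Stop | UAG | Stop |
| AAU | Asn | AAC | Asn | UGA | Stop | UGG | Trp |
| AUU | Ile | AUC | Ile | AUA | Ile | AUG | Met |
| H on/off | 84/85 | | 84/85 | | 94/95 | | 100/101 |
| Atoms on/off | 144/145 | | 144/145 | | 147/148 | | 163/164 |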
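The third-base classification of Table 6 amounts to grouping the 64 codons by their last base. The sketch below does this for the standard code, assuming the usual compact 64-letter encoding (bases ordered U, C, A, G, `*` for stop), and then checks that the four values in each of the last two rows add up to the totals 362/366 (hydrogen) and 598/602 (atoms) reported for the 61 sense codons. All names are illustrative.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated with the third base varying fastest.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One class per third base: the 16 codons ending in U, C, A or G.
third_base_classes = {
    base: {codon: aa for codon, aa in CODE.items() if codon[2] == base}
    for base in BASES
}

for base, members in third_base_classes.items():
    stops = [codon for codon, aa in members.items() if aa == "*"]
    print(base, len(members), stops)

# Last two rows of the table, as (on, off) pairs per third-base class U, C, A, G.
H_ROW = [(84, 85), (84, 85), (94, 95), (100, 101)]
ATOM_ROW = [(144, 145), (144, 145), (147, 148), (163, 164)]
h_totals = tuple(sum(pair[i] for pair in H_ROW) for i in (0, 1))
atom_totals = tuple(sum(pair[i] for pair in ATOM_ROW) for i in (0, 1))
print(h_totals, atom_totals)
```

Each class holds 16 codons; the stops UAA and UGA fall in the A class and UAG in the G class, as in columns 5 and 7 of the table.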
Table 7. The Rosandić and Parr “ideal” symmetry classification scheme [9]. The symbols and the colors are the same as in Table 2 but, here, the division into subsets is as follows: the “leading” group is shown in yellow (A+U rich) and orange (G+C rich) while the “non-leading” group is shown in light grey (A+U rich) and light blue (C+G rich). (The codons of the three sextets are underlined; see text).
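| UUU−Phe | UUC−Phe | UCU−Ser | UCC−Ser | CUU Leu | CUC Leu | CCU Pro | CCC Pro |
| UUA Leu | UUG Leu | UCA−Ser | UCG−Ser | CUA Leu | CUG Leu | CCA Pro | CCG Pro |
| UAU Tyr | UAC Tyr | UGU Cys | UGC Cys | CAU−His | CAC−His | CGU Arg | CGC Arg |
| UAA Stop | UAG Stop | UGA Stop | UGG−Trp | CAA−Gln | CAG−Gln | CGA Arg | CGG Arg |
| AUU Ile | AUC Ile | ACU−Thr | ACC−Thr | GUU Val | GUC Val | GCU Ala | GCC Ala |
| AUA Ile | AUG Met | ACA−Thr | ACG−Thr | GUA Val | GUG Val | GCA Ala | GCG Ala |
| AAU Asn | AAC Asn | AGU−Ser | AGC−Ser | GAU Asp | GAC Asp | GGU−Gly | GGC−Gly |
| AAA Lys | AAG Lys | AGA Arg | AGG Arg | GAA Glu | GAG Glu | GGA−Gly | GGG−Gly |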
Table 8. The “supersymmetry” genetic code table, reproduced from [10] except for colors. Here, the background colors for the subsets are the same as those in Table 7: “leading” group (A+U rich/G+C rich) and “non-leading” group (A+U rich/C+G rich). The two “mirror” symmetry axes (vertical and horizontal) are shown in thick dotted lines. In columns 4 and 5, purine: 0, pyrimidine: 1 as in reference [10]. The first column indicates the boxes: direct box (DB) and complement box (CB).
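| Boxes | aa | Codons | Pu/Py | Pu/Py | Codons | aa |
|---|---|---|---|---|---|---|
| DB | Start | AUG | 010 | 010 | GCA | A |
| DB | I | AUA | 010 | 010 | GCG | A |
| DB | I | AUC | 011 | 011 | GCU | A |
| DB | I | AUU | 011 | 011 | GCC | A |
| CB | Y | UAC | 101 | 101 | CGU | R |
| CB | Y | UAU | 101 | 101 | CGC | R |
| CB | Stop | UAG | 100 | 100 | CGA | R |
| CB | Stop | UAA | 100 | 100 | CGG | R |
| DB | E | GAG | 000 | 000 | AGA | R |
| DB | E | GAA | 000 | 000 | AGG | R |
| DB | D | GAC | 001 | 001 | AGU | S |
| DB | D | GAU | 001 | 001 | AGC | S |
| CB | L | CUC | 111 | 111 | UCU | S |
| CB | L | CUU | 111 | 111 | UCC | S |
| CB | L | CUG | 110 | 110 | UCA | S |
| CB | L | CUA | 110 | 110 | UCG | S |
| DB | L | UUA | 110 | 110 | CCG | P |
| DB | L | UUG | 110 | 110 | CCA | P |
| DB | F | UUU | 111 | 111 | CCC | P |
| DB | F | UUC | 111 | 111 | CCU | P |
| CB | N | AAU | 001 | 001 | GGC | G |
| CB | N | AAC | 001 | 001 | GGU | G |
| CB | K | AAA | 000 | 000 | GGG | G |
| CB | K | AAG | 000 | 000 | GGA | G |
| DB | Q | CAA | 100 | 100 | UGG | W |
| DB | Q | CAG | 100 | 100 | UGA | Stop |
| DB | H | CAU | 101 | 101 | UGC | C |
| DB | H | CAC | 101 | 101 | UGU | C |
| CB | V | GUU | 011 | 011 | ACC | T |
| CB | V | GUC | 011 | 011 | ACU | T |
| CB | V | GUA | 010 | 010 | ACG | T |
| CB | V | GUG | 010 | 010 | ACA | T |
| | Column 1 | | | Column 2 | | |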
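The Pu/Py entries of Table 8 follow directly from the rule stated in the caption (purine: 0, pyrimidine: 1), applied base by base with leading zeros kept. The sketch below recomputes the three-character pattern for every codon and checks that the two codons on each row share the same pattern. The codon lists are copied from Columns 1 and 2 in row order; `pu_py` and the other names are illustrative.

```python
PURINES = frozenset("AG")  # A and G are purines; U and C are pyrimidines


def pu_py(codon):
    """Return the purine/pyrimidine pattern of a codon (purine: 0, pyrimidine: 1)."""
    return "".join("0" if base in PURINES else "1" for base in codon)


# Codons of Column 1 and Column 2, in the row order of the table.
COLUMN_1 = (
    "AUG AUA AUC AUU UAC UAU UAG UAA GAG GAA GAC GAU CUC CUU CUG CUA "
    "UUA UUG UUU UUC AAU AAC AAA AAG CAA CAG CAU CAC GUU GUC GUA GUG"
).split()
COLUMN_2 = (
    "GCA GCG GCU GCC CGU CGC CGA CGG AGA AGG AGU AGC UCU UCC UCA UCG "
    "CCG CCA CCC CCU GGC GGU GGG GGA UGG UGA UGC UGU ACC ACU ACG ACA"
).split()


def rows_share_pattern():
    """True if the two codons on every row have the same Pu/Py pattern."""
    return all(pu_py(left) == pu_py(right) for left, right in zip(COLUMN_1, COLUMN_2))


print(pu_py("AUG"), pu_py("GCA"), rows_share_pattern())
```

The two columns together contain each of the 64 codons exactly once, so the check covers the whole table.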
Table 9. The Alternative Yeast Nuclear Code (#12 in [16]). The symbols and the colors are the same as those in Table 2. The multiplets are the same as those of the standard genetic code of Table 2 except for the reassignment of the codons of the two amino acids serine (orange background) and leucine (light blue background); see text for details.
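| UUU−Phe | UUC−Phe | UCU−Ser | UCC−Ser | CUU Leu | CUC Leu | CCU Pro | CCC Pro |
| UUA Leu | UUG Leu | UCA−Ser | UCG−Ser | CUA Leu | CUG−Ser | CCA Pro | CCG Pro |
| UAU Tyr | UAC Tyr | UGU Cys | UGC Cys | CAU−His | CAC−His | CGU Arg | CGC Arg |
| UAA Stop | UAG Stop | UGA Stop | UGG−Trp | CAA−Gln | CAG−Gln | CGA Arg | CGG Arg |
| AUU Ile | AUC Ile | ACU−Thr | ACC−Thr | GUU Val | GUC Val | GCU Ala | GCC Ala |
| AUA Ile | AUG Met | ACA−Thr | ACG−Thr | GUA Val | GUG Val | GCA Ala | GCG Ala |
| AAU Asn | AAC Asn | AGU−Ser | AGC−Ser | GAU Asp | GAC Asp | GGU−Gly | GGC−Gly |
| AAA Lys | AAG Lys | AGA Arg | AGG Arg | GAA Glu | GAG Glu | GGA−Gly | GGG−Gly |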
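The effect of the reassignment shown in Table 9 on the multiplet structure can be tallied by direct counting. The sketch below does not use the Fibonacci-like sequences of the article; it simply counts codons per amino acid, assuming the usual compact 64-letter encoding of the standard code (bases ordered U, C, A, G, `*` for stop) and the single change of CUG from Leu to Ser that appears in the table. All names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated with the third base varying fastest.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Alternative yeast nuclear code: CUG is read as serine instead of leucine.
YEAST = dict(STANDARD)
YEAST["CUG"] = "S"


def multiplet_structure(code):
    """Map multiplet size -> number of amino acids coded by that many codons."""
    codons_per_aa = Counter(aa for aa in code.values() if aa != "*")
    return dict(sorted(Counter(codons_per_aa.values()).items()))


print("standard:", multiplet_structure(STANDARD))
print("yeast:   ", multiplet_structure(YEAST))
```

In the standard code the count gives 2 singlets, 9 doublets, 1 triplet, 5 quartets and 3 sextets. After the reassignment, serine has seven codons and leucine five, while arginine remains the only sextet and all other multiplets are unchanged.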

Négadi, T. Fibonacci-like Sequences Reveal the Genetic Code Symmetries, Also When the Amino Acids Are in a Physiological Environment. Symmetry 2024, 16, 293. https://doi.org/10.3390/sym16030293