The Impact of Sequencing Genomes on the Understanding of the Origin of Life on Earth
Historical Background
Has life on planet Earth always been as it is today? Why don’t we see animals like Dinosaurs, Woolly Mammoths, Mastodons or Saber-Toothed Tigers in our zoos? The answer is that life on Earth has not always been as it is today. Those ancient animals have become extinct. The nucleus of the cells of all living creatures carries DNA (Deoxyribonucleic Acid), which not only stores information but also passes it on to the next generation. DNA is built from four chemicals called nucleotide bases: Adenine (A), Thymine (T), Guanine (G) and Cytosine (C). They come in pairs: A always pairs with T, and G always pairs with C. This pairing implies a double helical structure in which two chains run in opposite directions, each base on one chain paired with its partner on the other. The double helical structure of the nucleotide base pairs solves the mystery of life. It shows how the information to create life is stored, copied and passed on to future generations.
The essence of life is information, and the information to convert non-living chemicals into living creatures is written on the double helical structure of DNA. The living cell has no Soul, no Holy Spirit, no Vital Force and no Divine Intervention. We now know that Life is a series of coordinated chemical reactions of nucleotide bases. Different life forms are the result of the slow process of Natural Selection. Once we discovered that the secret of life resides in DNA, we could manipulate life; that is, we could cut, paste and copy DNA to create in the test tube a new life form that never existed before. Such a new life form could carry instructions not only to clean up our environmental pollution, but also to provide the most nutritious food for the burgeoning population of the world, to provide new fuel to run the engine of modern society, and to provide new medicine to treat every disease known to mankind.
If you are a religious person and believe that the creation of life is a miracle, you understand the origin of life from a different point of view: that of religious faith, based on a belief system which says that there is a Creator who created life on Earth. What it tells me is that the way to know the Creator is to study the Creation of life. Most religions hold that the heavens declare the Glory of God and show his handiwork; there is a Creator who has done it. If that is true, then I must study creation, trying to read the mind of God. There is a tremendous creative drive built into the Cosmos. I respect your belief. For scientists, life is the result of four and a half billion years of evolution. We see life develop everywhere on Earth. For example, during springtime we see plants grow and flowers bloom; we see children born and growing up. Life begins with a single cell and grows to become a full human being. This is not a violation of natural laws. It is a fact of fundamental Natural Law. I don’t see a conflict between Science and Religion. I respect people of Faith. Some people have deep faith; that is their spiritual way of knowing the Origin and the Creation of life on Earth.
I am a scientist. I look at the origin of life from a different point of view, the rational and scientific one. Objective truth is verified by experimental evidence. For example, Water boils at one hundred degrees Centigrade and freezes at zero degrees Centigrade. You could conduct this experiment in New York or in New Delhi; the result should be reproducible and verifiable. This is how I see the world. Your way of looking at life is different from mine. Your point of view is different, but it is not wrong. On the question of how life became so diverse, it was Charles Darwin who provided the most rational answer. Charles Darwin was one of the greatest biologists who ever lived. In 1859, in his book On the Origin of Species, he stated that Life evolves and Nature selects. What he meant was that the design and complexity of living creatures on Earth, from the simplest to the most complex species, arose not by any act of Divine Intervention but by the slow process of Natural Selection responding to the surrounding environment.
Species that evolve traits to respond to the changing environment survive, and species that fail to adapt die out. Their fossils remain trapped in the layers of rock as proof of their existence. Soon after the formation of our Solar System about four and a half billion years ago, the hot surface of our Earth cooled. The ancient fossil record shows that within half a billion years the first life forms appeared, during what is called the Precambrian. During the roughly 25 million years leading from the end of the Precambrian into the Cambrian Explosion, hundreds of new species evolved. Most Precambrian life forms were unicellular, soft-tissue creatures which decomposed over the years, so that at best only their impressions on the rock could be preserved. Only creatures that evolved hard shells near the beginning of the Cambrian Explosion were fossilized in the earliest sedimentary rocks. From that time, the best-known creatures that left their fossils behind are the Trilobites, multicellular crab-like creatures which crawled along the bottom of the ancient seas.
Darwin’s critics argue that the earliest life should be unicellular creatures, not multicellular Trilobites. They forget that unicellular soft-tissue creatures do not fossilize, and that there were millions of soft-tissue creatures during the Precambrian. Only as we approach the Cambrian Explosion, over some 25 million years, did multicellular hard-shelled creatures appear, and only hard-shelled creatures like Trilobites left their fossils behind. Darwin’s critics will be proved wrong. We have recently learned a technique to extract DNA from fossils. Using the new technique, a group of German scientists extracted DNA from our ancient relatives, the Neanderthals, and completely sequenced (decoded) their Genome. The Neanderthals died out over 30,000 years ago. We could use the same technique to extract the DNA of creatures of the Precambrian. DNA from any fossil, or from its impression left on the Precambrian rocks, could be extracted and sequenced to prove the evolution of life from the simplest to the most complex forms.
The toolkit developed during the sequencing of the human genome helps us sequence the genomes of other creatures, living and ancient, for comparison. Now we can sequence genomes from the simplest microbes to mouse, monkey and man, and compare them to see how organisms evolved from the simplest to the most complex. We can ask how the four nucleotides, the building blocks of life, originated on Earth through the interaction of Carbon, Nitrogen and Oxygen, and how they organized themselves to become alive. Life is a series of coordinated chemical reactions of basic building blocks called the nucleotide bases. If you sequence genomes from the simplest to the most complex life forms and compare them, you see how the same four nucleotides were arranged differently over the eons in response to the surrounding environment (Figure 1).
Cambrian Explosion
Darwin had the greatest foresight. By comparing the specimens he brought back from the Galapagos, he saw the evidence of evolution. Paleontology is the study of fossils in the layers of rock, tracing the evidence and ecology of plants and animals from the distant past to the present day. Most fossils are found in sedimentary rocks and in clay deposited on the layers of rock, one layer deposited on top of another. Trapped in these layers are fossils millions of years old, at various stages of evolution. As the rivers dried up, the sediments hardened into rock. The sedimentary rocks unfold like the pages of a gigantic book. The earliest fossils, of simple structure, are found in the lowest and oldest layers. As Darwin examined younger and younger rocks, he found increasing complexity of structure. No human bones were ever found in any of these ancient rocks. At the end of the Precambrian, about 540 million years ago, the climate changed: the frozen Earth began to warm, and the Cambrian Explosion occurred.
The single-celled living creature, instead of multiplying by asexual reproduction, began to multiply by sexual reproduction. The interaction of two separate sets of chromosomes resulted in variation in the gene pool, which led to the divergence of life forms and their evolution from the simplest to more complex forms, called the Cambrian Explosion of life. The progeny carrying the recombined genes produced complexity. Only those recombinant daughter cells which carried genes that produced functional proteins survived; the rest died. The proof of the Cambrian Explosion, which lasted for about 25 million years, is trapped in the fossil record. Extracting fossils from the ancient, eroded rocks is a real challenge. The erosion of sedimentary rocks over the years is due to rainfall, windstorms, running water, and the transport of the rocks. Once DNA is extracted from a fossil and purified, its genome could be sequenced, and the age of the fossil could be estimated by the Radioactive Dating method (Figure 2).
Radioactive Dating
Radioactive dating provides accurate measurements of the age of ancient rocks and of the fossils trapped inside them. Heavy elements with large nuclei are unstable. Over a long period of time, their nuclei break apart into more stable elements. In becoming stable they release radiation, such as alpha, beta and gamma radiation, and so they are called radioactive elements; examples are Radium, Thorium, Rubidium and Uranium. Each radioactive element decays at a steady, measurable rate over millions of years. By measuring the ratio of a radioactive element to its stable end-product, it is possible to measure the age of the rock and of the fossils trapped inside it. For example, the radioactive element Uranium (Atomic Weight = 238) breaks down slowly over millions of years, through a series of intermediate isotopes, to the stable element Lead (Atomic Weight = 206). Along the way, Uranium decays to Radium (Atomic Weight = 226), which decays further to Polonium (Atomic Weight = 218), then to another isotope of Polonium (Atomic Weight = 210), and finally to stable Lead (Atomic Weight = 206). Radioactive decay is a slow process. Over a million years, one gram of Uranium produces about 1/7000 gram of Lead. By measuring the ratio of the amount of Uranium to the amount of Lead in a rock, we can calculate the age of the Uranium-bearing rock and of the fossils trapped inside it (Figure 3).
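The Uranium-to-Lead calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the half-life of Uranium-238 is taken as about 4.47 billion years, all of the Lead-206 in the rock is assumed to come from Uranium decay, the mass numbers 238 and 206 stand in for exact atomic masses, and the function names are hypothetical ones chosen for this example.

```python
import math

# Assumed value: half-life of Uranium-238 in years (not given in the text).
U238_HALF_LIFE_YEARS = 4.468e9
# Decay constant: the fraction of remaining Uranium atoms that decays per year.
DECAY_CONSTANT = math.log(2) / U238_HALF_LIFE_YEARS


def age_from_ratio(lead_atoms, uranium_atoms):
    """Estimate the age in years from Lead-206 and Uranium-238 atom counts.

    Assumes the rock contained no Lead-206 when it formed and has neither
    gained nor lost Uranium or Lead since.
    """
    return math.log(1.0 + lead_atoms / uranium_atoms) / DECAY_CONSTANT


def lead_grams_per_gram_uranium(years):
    """Grams of Lead-206 produced from one gram of Uranium-238 after `years`."""
    fraction_decayed = 1.0 - math.exp(-DECAY_CONSTANT * years)
    # Each decayed Uranium atom (mass ~238) ends as one Lead atom (mass ~206).
    return fraction_decayed * 206.0 / 238.0


if __name__ == "__main__":
    # One gram of Uranium after a million years: the reciprocal of the Lead
    # produced falls between 7,000 and 8,000, in line with the 1/7000 figure.
    print(1.0 / lead_grams_per_gram_uranium(1e6))
    # Equal numbers of Lead and Uranium atoms correspond to one half-life.
    print(age_from_ratio(1.0, 1.0))
```

The age formula is the decay law solved for time: the more Lead that has accumulated relative to the Uranium that remains, the older the rock.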
The Geologic Clock
Let me scan the history of life on planet Earth from the very beginning to the present day. The slow evolutionary changes can be traced from three and a half billion years ago to the present day. If we examine the fossil record against the Geologic Time Scale, we can divide the time since the Precambrian into three great eras. First, the Paleozoic Era, which follows the Precambrian and lasted from about 540 million to 230 million years ago. Second, the Mesozoic Era, from 230 million to 70 million years ago. Third, the Cenozoic Era, from about 63 million years ago to the present day. During the Precambrian, primitive life forms appeared. No fossils of the soft-bodied creatures were found, except for some impressions left on the ancient rocks. From the end of the Precambrian, hard-shelled fossils made of Calcium Carbonate, like Trilobites, are found. During the Cambrian Period, which began about 540 million years ago, primitive life forms such as Algae and Arthropods appeared; much later, sponges, worms and mollusks appeared.
A treasure of fossils called the Burgess Shale was discovered in the Canadian Rockies of British Columbia, Canada. These fossils date from the Middle Cambrian. Part of this treasure is on display at the Smithsonian Museum in Washington, D.C. Among those fossils were sea cucumbers, worms and Trilobites. During the Ordovician Period, from 500 to 425 million years ago, life forms with hard skeletons appeared for the first time on the primitive Earth, including Tetracorals, echinoids and asteroids (starfish). The Silurian Period, from 425 to 405 million years ago, brought the most dramatic changes in the Earth’s atmosphere. Plants appeared for the first time. Up to this time, Earth’s atmosphere was full of Nitrogen gas released by the cooling of the hot Nitrate rocks. Earth’s atmosphere also contained Carbon dioxide contributed by volcanic eruptions. Plants carry a Chloroplast Genome, derived from a prokaryotic life form captured early in the evolutionary process.
Genes in the Chloroplast genome give plants the ability to capture Carbon dioxide from the atmosphere and, in the presence of sunlight, convert it to Carbohydrate, their food, releasing Oxygen as the by-product. For the following 60 million years, into the beginning of the Devonian Period, the cooled part of planet Earth was carpeted by early photosynthetic life called Blue-Green Algae. Its main function was to absorb Carbon dioxide from the atmosphere, convert it to Carbohydrate, and release Oxygen as a part of photosynthesis. Over millions of years, an enormous amount of Oxygen was released into the atmosphere. By the end of the Silurian Period, the composition of the Earth’s atmosphere had changed from almost pure Nitrogen to about 80% Nitrogen and 20% Oxygen. Oxygen gas is extremely reactive; it reacted with the Iron dissolved in the Oceans to form Iron Oxide. Billions of tons of Iron Oxide were deposited on the Ocean floor.
Oxygen is toxic to anaerobic life forms. Creatures that could survive in the presence of Oxygen thrived, while anaerobic life forms died. Complex life forms appeared. From the time of the Oxygen atmosphere, fossils of Fish and Amphibians were found, along with fossils of spiders, millipedes, insects and corals. The period from 345 to 310 million years ago is designated the Mississippian Period, during which fossils of more complex life forms appeared, including Corals, Brachiopods and Foraminifers. From 310 to 280 million years ago came the great Coal-bearing layers of rock known as the Pennsylvanian Period. It saw low-lying land covered by great swamps that formed the Coal forests. This period saw the appearance of clams, shellfish, reptiles and amphibians. The Permian Period, from 280 to 230 million years ago, saw the swampy parts of the surface develop Coal-forest plants such as Conifers, Tongue-fern and Oak, along with insects, beetles and dragonflies. Because of climatic changes, the plants and animals of the Permian Period became extinct.
This marked the end of the Paleozoic Era. The Mesozoic Era began about 230 million years ago. It brought the Age of Reptiles. With this Era came Birds, Mammals, Insects and Flowering Plants; Elm, Oak and Maple became common. New mountain ranges slowly appeared. The Triassic Period, from 230 to 180 million years ago, saw the appearance of the Dinosaurs, the mighty beasts that ruled planet Earth for about 150 million years. They all died when a meteorite struck planet Earth about 65 million years ago. They left behind their footprints as fossils around the world. During the Jurassic Period, rain forest spread everywhere. The Dinosaurs dominated the land, but the Ocean was dominated by the Plesiosaurs, the monstrous carnivores of the seas. The Cretaceous Period, from about 135 to 70 million years ago, marked the development of sedimentary rock made of Chalk. The movement of the Tectonic Plates formed mountain ranges from the Andes to the Rockies, and from Antarctica to Northwestern Asia. Plants thrived during the Cretaceous Period. The fossil record shows the appearance of trees and shrubs, including Magnolia, Oak, Maple, Birch, Holly and Ivy, which provided food for mammals, birds, reptiles and insects. Dinosaurs spread to all seven continents. Their fossils are found all over the planet. As I said above, they all disappeared around 65 million years ago, when a meteorite struck the Yucatán Peninsula of Mexico. The Cenozoic Era, called the Age of Mammals, runs from about 70 million years ago to the present day. Climatic changes resulted in the cooling of the polar regions and a warmer climate elsewhere. This climate has persisted to the present day. The Cenozoic Era is dominated by the Flowering Plants, and reptiles are replaced by Mammals. Birds continue to expand everywhere.
Finally, the Quaternary Period, in which we now live, saw four glacial advances in which an ice sheet up to 10,000 feet thick covered much of the Northern Hemisphere. The most recent melting of that ice sheet began about 11,000 years ago.
As the ice sheet melted away, it created a suitable environment for the emergence of human beings. The appearance of humans on Earth is a matter of only a few million years. Have we found human fossils in rocks from any of the earlier Geologic periods, from the Precambrian onward? The answer is no. In 1974, the anthropologist Donald Johanson discovered the early human fossil Lucy, an Australopithecus afarensis skeleton, in 3.2-million-year-old rock at Hadar in Ethiopia. Chimps had been living in the Great Rift Valley for the past 25 million years. A more advanced form called Australopithecus appeared in East Africa. It was followed by an advanced forest man called Homo habilis, a hunter-gatherer who built tools. He was a direct ancestor of Man and lived about 2 million years ago. Next was the apeman Pithecanthropus, who lived about 500,000 years ago in Java and China. Neanderthal man lived in Europe about 100,000 years ago and was also a hunter-gatherer.
The Neanderthals all died out about 30,000 years ago. Cro-Magnon finally evolved the modern brain. The term Cro-Magnon is derived from the Cro-Magnon rock shelter in southwestern France, where the first such human fossils were found in 1868. Darwin’s extraordinary prediction was confirmed by sequencing the genomes, that is, reading the books of life, of over a thousand species on Earth. Of all the experiments in Biology, the Sequencing of the Human Genome was the greatest accomplishment of all time. On April 14, 2003, Dr. Francis Collins of my institution, the National Institutes of Health (NIH), announced that we had read the book in which God created life. We completely read the book of life of a human being letter by letter, word by word, sentence by sentence and chapter by chapter: all 46 volumes, called the Chromosomes, carrying 24,000 chapters, called the genes, with a text written in four nucleotides and containing six billion four hundred million letters. In a few sentences, he described the completion of the Human Genome Project.
It was the greatest biological experiment ever conceived by the human mind. It will answer the most fundamental questions we have asked ourselves since the dawn of human civilization. What does it mean to be human? What is the nature of our memory and our consciousness? How do we develop from a single cell to a complete human being? What is the biochemical nature of our senses, and of the processes of our aging? What is the scientific basis of our similarities and dissimilarities? All living creatures, from a tiny blade of grass to the mighty Elephant, including man, mouse, monkey, mosquitoes and microbes, are made of the same chemical building blocks, and yet they are so diverse that no two individuals are alike; even identical twins are not identical, and grow up to become two separate individuals. The essential components of life are RNA, DNA, Proteins, Carbohydrates, Lipids and Hormones. We have always wondered how these non-living chemicals could come together to create living creatures. When did life evolve? Where did it evolve? And how did life evolve? The Evolution of Life on Earth is not a miracle.
Life could have evolved on Earth’s surface, such as on the oldest rocks found in Australia, or it could have evolved on the Ocean floor, where Black Smokers form as hot material from undersea volcanoes emerges, rich in Hydrogen Sulfide, which provides energy for life forms such as the tubeworms and crabs that thrive there. Life could also have evolved underground. Soil samples brought by miners from gold mines in South Africa, two miles deep underground, contained micro worms. Such life forms could be cultivated on a Petri dish containing Agar mixed with nutrients. Early life could have been unicellular. Within 24 hours, the Petri dish could be filled with microbial life. Could life have been brought to Earth by meteorites? The early Earth had no Water. Billions of Comets brought Water to Earth. Would it be possible that some of those icy mountains contained the essential components that give rise to life? Life could also have evolved on the surface of the Earth. A million lightning bolts strike Earth each day.
Could it be possible that at some remote corner of the Earth, lightning struck a cloud of gases such as Ammonia, Methane and Sulfur compounds over Phosphate-containing rocks, making the essential components of life such as nucleotides, which combined to form RNA, a molecule which not only carries information like DNA but also performs functions like proteins? The polymerization of Formaldehyde in the atmosphere could produce Carbohydrates, another essential component of life. Acetonitrile, Carbon dioxide and Water, in the presence of Ultraviolet light, could produce the nucleotide bases such as Adenine (A), Guanine (G) and Cytosine (C), together with Uracil (U), which takes the place of Thymine (T) in RNA, forming a four-letter code leading to RNA which started replicating itself, creating the first living anaerobic creature. An RNA molecule can catalyze reactions like a protein enzyme, and it can also store information like DNA. Were there creatures in the RNA world which thrived in the absence of Oxygen? Since no human was present to witness the formation and evolution of the first life on Earth, we rely for evidence of its presence on the early fossils found in the layers of ancient rocks.
Once a single replicating living cell appears on Earth, complexity develops. In other words, all complex life forms evolved from simple life forms. We deduce their evolutionary development from the fossil record. Fossils are the remains of prehistoric life forms. To become fossilized, a species must have developed hard parts such as bone or shell and must have been trapped in mud which slowly hardened into rock. Soft-tissue creatures do not fossilize; their tissues decompose. As I said above, the Earth was formed about four and a half billion years ago. The hot Earth was cooled by the bombardment of icy comets. Every drop of Water on Earth was brought by the icy comets. The first life form appeared on Earth about a billion years after the Earth was formed. Over a long evolutionary process, chemicals reacted together to create Life. One of the most essential components of Life, the Amino Acid, has been created in the Lab.
In 1953 Stanley Miller, a student of the Nobel Laureate Harold Urey at the University of Chicago, conducted an experiment in the Lab to create life’s essential components, the amino acids. He created primitive-Earth-like conditions in the Lab. He took two flasks connected with a condenser. One flask contained water, which was heated to produce vapor, and the other was filled with gases thought to be present on the primitive Earth, such as Methane, Hydrogen and Ammonia. To mimic thunder and lightning, a source of energy on Earth, he passed electric sparks through the flask. The high-energy electric spark split the stable molecules containing Nitrogen, Oxygen and Carbon, producing extremely reactive fragments which reacted with one another, recombining to produce more stable new molecules. Within a week, the clear solution in the flask became pink and then dark. Analysis of the colored material showed the formation of Amino Acids, the essential building blocks of the proteins which perform all body functions. In similar experiments, Francis Crick and Leslie Orgel attempted to synthesize Nucleotides, the building blocks of the replicating molecules which carry the information to make life. Using Formaldehyde, other essential components of life such as sugars and hormones were synthesized.
Chromosomes are thread-like structures located inside the nucleus of animal and plant cells. Each chromosome is made of a long double-stranded chain of the four nucleotides, called deoxyribonucleic acid (DNA), wrapped around proteins. As I stated above, DNA is the information molecule which is passed on from parents to offspring; it contains the specific instructions that make each type of living creature unique. As living creatures evolved, the number of chromosomes often increased along with complexity. The evolution of life on planet Earth is an extremely slow process. Life appeared about a billion years after the formation of the Earth, which took place about four and a half billion years ago. Bacteriophage Phi-X 174 is perhaps the smallest organism; its genome is made of just over 5,000 nucleotide bases. It carries a single chromosome, as do most bacteria. As evolution proceeded, chromosome numbers increased and complexity appeared in both plants and animals, helping them survive in a changed environment. For example, while bacteria have a single chromosome, the Jack Jumper Ant has two chromosomes.
Yellow fever mosquito has 6 chromosomes; Fruit fly has 8; Swamp wallaby has 10; Nematode and the Australian daisy have 12; Spider mite, Aloe vera and Cucumber have 14; Garlic has 16; Itch mite has 17; Radish, Carrot, Cabbage and Passion fruit have 18; Maize and Cannabis have 20; Bean and Virginia opossum have 22; Snail, Melon, Rice and Sweet chestnut have 24; Edible frog has 26; Axolotl has 28; Bed bug has 29; Giraffe, American mink and Pistachio have 30; Yeast, European honey bee, American badger and Alfalfa have 32; Red fox, Sunflower and Porcupine have 34; Yellow mongoose, Tibetan sand fox, Starfish, Red panda, Meerkat and Earthworm have 36; Tiger, Sea otter, Sable, Raccoon, Pig, Lion and European mink have 38; Mouse, Mango, Hyena, Ferret, Beaver and Peanut have 40; Wolverine, Wheat, Rat and Oats have 42; Dolphin, Sable antelope and Human have 46; Water buffalo (swamp type), Tobacco, Potato, Orangutan, Hare, Gorilla, Deer mouse and Chimpanzee have 48; Zebrafish, Water buffalo (river type), Striped skunk and Pineapple have 50; Spectacled bear, Platypus and Cotton have 52; Sheep, Hyrax, Raccoon dog and Capuchin monkey have 54; Strawberry, Gaur and Elephant have 56; Woolly mammoth has 58; Yak, Goat, Cow/Bull, American bison and Bengal fox have 60; Gypsy moth, Donkey and Scarlet macaw have 62; Mule has 63; Guinea pig, Spotted skunk, Horse and Fennec fox have 64; Gray fox, Red deer, Elk and Roadside hawk have 68; White-tailed deer has 70; Black nightshade and Bat-eared fox have 72; Asiatic black bear and American black bear have 74; Maned wolf has 76; Grey wolf, Golden jackal, Dog and Dingo have 78; Turkey, Sugarcane and Pigeon have 80; Great white shark has 82; one hedgehog genus has 88; Moonworts, another hedgehog genus and Grape fern have 90; Pittier’s crab-eating rat, Prawn and Aquatic rat have 92; Kamraj (a fern) has 94; Carp has 100; Red vizcacha rat has 102; Walking catfish has 104; American paddlefish has 120; Northern lamprey has 174; Rattlesnake fern has 184; Red king crab has 208; Field horsetail has 216; A. butterfly has 268; Black mulberry has 308; Atlas blue has 448. Adder’s-tongue, a fern of the genus Ophioglossum, has the highest chromosome count of any known living organism, with 1,260 chromosomes.
This fern has roughly 630 pairs of chromosomes, or 1,260 chromosomes per cell. The next generation of scientists will have the opportunity to sequence the genomes of all the above species, and their genes could be added to our GenBank to be used to develop new food, new fuel and new medicine for the burgeoning population of the world. From the above observations, it becomes clear that humans are not the pinnacle of evolution. Other creatures have more chromosomes than we do. Our superiority lies in our consciousness, our language, and our ability to communicate orally and in writing, leaving our knowledge for future generations (Figure 4).
The Impact of Sequencing Human Genome on the understanding of the Origin of Life
As I said above, our entire book of life is written in four genetic letters called nucleotides: A (adenine), T (thymine), G (guanine) and C (cytosine). They are read in three-letter words called codons. The essence of life is information, which is carried by these four nucleotides. They are found in the nucleus of all living cells, including those of humans, plants and animals. The instruction in a single gene is written in thousands of A/T and G/C base pairs linked together in a line, and we call this molecule DNA (Deoxyribonucleic Acid). The Nobel Prize was awarded to Crick, Watson and Maurice Wilkins [1] for discovering the double helical structure of DNA. DNA is transcribed into single-stranded RNA; in RNA, Thymine (T) is replaced by Uracil (U), which lacks Thymine’s Methyl group. The messenger RNA (mRNA) leaves the nucleus and moves into the Cytoplasm, where it is translated on Ribosomes into chains of Amino Acids, which form proteins. When a stretch of thousands to millions of base pairs contains the information to make a single protein, we call that stretch a gene (the Nobel Prize was awarded to Khorana and Nirenberg for deciphering the genetic code).
A gene is a string of DNA. In its messenger RNA, the start codon is AUG, which codes for the amino acid Methionine; after several hundred codons for different amino acids comes a stop codon. There are three stop codons: UAA, UGA and UAG. After the stop codon, no more amino acids are added to the chain, and protein synthesis stops. If we count all the A/T and G/C base pairs in a single cell of our body, we will find that there are 3.2 billion pairs of bases present in the nucleus of every cell. The entire sequence of 3.2 billion base pairs is called the Human Genome, the book of our life, which carries the total genetic information to make us. Reading this total genetic information is called sequencing the Human Genome. In 1990, the US Congress authorized three billion dollars for NIH to decipher the entire Human Genome under the title “The Human Genome Project.” We found that our genome contains six billion four hundred million nucleotide bases; half come from our father and half from our mother.
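The reading of a gene, from the start codon AUG to a stop codon, can be sketched in a short Python example. It is a simplified illustration, not a description of any particular software: it assumes the standard genetic code (in which the three stop codons are UAA, UAG and UGA), a short made-up DNA string, and the hypothetical function names transcribe and translate.

```python
# Standard genetic code, built compactly: codons are listed in U, C, A, G
# order and paired with one-letter amino acid codes ("*" marks a stop codon).
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def transcribe(coding_dna):
    """Return the mRNA for a DNA coding strand: every T becomes U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Translate from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # UAA, UAG or UGA: protein synthesis stops
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    mrna = transcribe("ATGGCCTGGTAA")  # made-up example sequence
    print(mrna)             # AUGGCCUGGUAA
    print(translate(mrna))  # MAW: Methionine, Alanine, Tryptophan, then stop
```

In this example the codon UGG is read as the amino acid Tryptophan, and translation ends only when the stop codon UAA is reached.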
Less than two percent of our Genome consists of genes which code for proteins. The other 98 percent contains switches, promoters, terminators, etc. The 46 Chromosomes present in each cell of our body are the greatest library of the Human Book of Life on planet Earth. The Chromosomes carry genes which are written in nucleotides. Before sequencing (determining the number and the order of the four nucleotides arranged on a Chromosome), it is essential to know how many genes are present on each Chromosome in our Genome. The Human Genome Project has identified not only the number of nucleotides on each Chromosome, but also the number of genes on each Chromosome. A single cell is so small that we cannot see it with our naked eyes. We must use a powerful microscope to enlarge its internal structure. Under an electron microscope, we can enlarge a cell up to nearly a million times its original size. At that magnification, a single cell looks as big as our house.
The house makes a good metaphor. Our house has a kitchen; the cell has a nucleus. Imagine for a moment that our kitchen has 23 volumes of cookbooks containing 24,000 recipes for breakfast, lunch and dinner. The nucleus has 23 pairs of chromosomes containing 24,000 genes, which carry instructions to make proteins. Proteins interact to make cells; cells interact to make tissues; tissues interact to make an organ; and several organs interact to make a man, a mouse or a monkey. In every cell of our body, we carry sixteen thousand good genes, six thousand mutated (bad) genes responsible for six thousand diseases, and two thousand pseudogenes that have lost their functions over evolutionary time.
The 46 chromosomes in each cell form this library, and, as noted above, the number of genes on each chromosome must be known before sequencing (Figure 5).
The Human Genome: The Greatest Catalog of Human Genes on Planet Earth
The Human Genome contains a catalog of traits written on genes in nucleotide sequence. It catalogs all 24,000 genes and gives the number and location of the genes on each chromosome. The catalog lists 16,000 good genes, 6,000 bad genes and 2,000 pseudogenes (genes that have lost their function). The Human Genome Project has identified the following genes on each chromosome. Chromosome 1 is the largest, carrying 263 million A, T, G and C nucleotide bases, yet it has only 2,610 genes. Chromosome 2 contains 255 million nucleotide bases and has only 1,748 genes. Chromosome 3 contains 214 million nucleotide bases and carries 1,381 genes. Chromosome 4 contains 203 million nucleotide bases and carries 1,024 genes. Chromosome 5 contains 194 million nucleotide bases and carries 1,190 genes. Chromosome 6 contains 183 million nucleotide bases and carries 1,394 genes. Chromosome 7 contains 171 million nucleotide bases and carries 1,378 genes. Chromosome 8 contains 155 million nucleotide bases and carries 927 genes.
Chromosome 9 contains 145 million nucleotide bases and carries 1,076 genes. Chromosome 10 contains 144 million nucleotide bases and carries 983 genes. Chromosome 11 contains 144 million nucleotide bases and carries 1,692 genes. Chromosome 12 contains 143 million nucleotide bases and carries 1,268 genes. Chromosome 13 contains 114 million nucleotide bases and carries 496 genes. Chromosome 14 contains 109 million nucleotide bases and carries 1,173 genes. Chromosome 15 contains 106 million nucleotide bases and carries 906 genes. Chromosome 16 contains 98 million nucleotide bases and carries 1,032 genes. Chromosome 17 contains 92 million nucleotide bases and carries 1,394 genes. Chromosome 18 contains 85 million nucleotide bases and carries 400 genes. Chromosome 19 contains 67 million nucleotide bases and carries 1,592 genes. Chromosome 20 contains 72 million nucleotide bases and carries 710 genes.
Chromosome 21 contains 50 million nucleotide bases and carries 337 genes. Chromosome 22 contains 56 million nucleotide bases and carries 701 genes. Finally, the X chromosome, the sex chromosome of which females carry two copies, contains 164 million nucleotide bases and carries 1,141 genes. The Y chromosome, which only males carry, contains 59 million nucleotide bases and carries 255 genes. If you add up the genes on all the chromosomes listed above, they come to 26,808, and yet we keep mentioning the 24,000 genes needed to keep us functioning normally. A gene codes for a protein, but not all 24,000 genes code for proteins. It is estimated that fewer than 19,000 genes code for proteins. Because of alternative splicing, a gene can code for more than one protein. All the genes in our body make fewer than 50,000 proteins, which interact in millions of different ways to give a single cell. Millions of cells interact to give a tissue, hundreds of tissues interact to give an organ, and several organs interact to make a human [1-6].
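The per-chromosome figures quoted above can be tallied with a short script to check the total of 26,808 genes. The dictionary simply copies the numbers from the text; the variable names are chosen for this sketch only.

```python
# (millions of nucleotide bases, number of genes) for each chromosome,
# copied from the figures quoted in the text.
CHROMOSOMES = {
    "1": (263, 2610), "2": (255, 1748), "3": (214, 1381), "4": (203, 1024),
    "5": (194, 1190), "6": (183, 1394), "7": (171, 1378), "8": (155, 927),
    "9": (145, 1076), "10": (144, 983), "11": (144, 1692), "12": (143, 1268),
    "13": (114, 496), "14": (109, 1173), "15": (106, 906), "16": (98, 1032),
    "17": (92, 1394), "18": (85, 400), "19": (67, 1592), "20": (72, 710),
    "21": (50, 337), "22": (56, 701), "X": (164, 1141), "Y": (59, 255),
}

total_bases_million = sum(bases for bases, _ in CHROMOSOMES.values())
total_genes = sum(genes for _, genes in CHROMOSOMES.values())

print(total_genes)          # 26808
print(total_bases_million)  # 3286
```

The gene total matches the 26,808 given in the text, and the base total of about 3.3 billion is in line with the figure of roughly 3.2 billion base pairs for one copy of the genome.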
Not all genes act simultaneously to make us function normally. Current studies suggest that a minimum of 2,000 genes is enough to keep a human functioning normally; many of the remaining genes act as a backup support system and are used when needed. Others have lost their function and are called pseudogenes. For example, millions of years ago humans and dogs shared some of the same ancestral genes, and we both carry the olfactory genes that dogs need to search for food. Since humans do not rely on smell to search for food, many of these genes are broken and have lost their function in humans, but we still carry them. Recently, some Japanese scientists have activated pseudogenes; this work may create ethical problems in the future as more and more pseudogenes are activated. Nature has good reasons to shut off those pseudogenes. Our genome provides the genetic road map of all our genes, past, present and future.
For example, it can tell us how many good or bad genes we inherit from our parents and how many of those genes we are going to pass on to our children. If a family carries too many bad genes and has a history of serious illnesses, its members can break off the information flow by not having children or by not donating mutated eggs and sperm. In April 2003, several groups completed the sequencing of the entire Human Genome and confirmed that less than two percent of the genome codes for proteins. The rest is non-coding regions, which contain switches that turn genes on or off and pieces of DNA that act as promoters and enhancers of the genes. Using restriction enzymes (which act as molecular scissors), we can cut, paste and copy genetic letters in the non-coding region, where changes can serve as markers and have no effect. In the coding region, however, a slight change, called a mutation, can make a normal cell abnormal or cancerous (Figure 6).
Our Search for Unknown Diseases Has Come to a Closure
The sequencing of the Human Genome has two powerful implications. One of them is that we have come to closure: we have the catalog of all genes in the Human Genome, so we can search the entire genome and locate the desired gene. We will not wander in the wilderness anymore. Everything there is to know about human health and traits is written on these genes in nucleotide sequences. Our genome provides the catalog of all genes.
Reference Sequence
We can scan the whole genome (the Reference Sequence) for its response to a given situation. When we compare a normal cell with an abnormal cell, we see the differences. When we compare their gene expression, looking for a specific protein from a specific gene with a specific nucleotide sequence, we can identify the specific mutation responsible for the disease. In the olden days, before the human genome was sequenced, when a patient visited a physician for some unknown ailment, the physician would order several tests and say to the patient, “I don’t know what is wrong with you. I will see whether any of these tests shows that my guess is right.” If the guess was wrong, the physician would recommend a few more tests to try to identify the illness. The days of guesswork and trial and error are over. Now, after the sequencing of the human genome, the physician would say to the patient, “I don’t know what is wrong with you, but I know where to find it. It is written in your genome.” It would be easy for a physician to have the patient’s entire genome scanned and compared against the Reference Sequence to identify the mutations responsible for the disease.
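The idea of comparing a patient’s sequence with the Reference Sequence letter by letter can be sketched as follows. This is a toy illustration under strong assumptions: the two short sequences are invented, they are already aligned and of equal length, and only single-letter substitutions are reported. Real pipelines must first align the reads and must also handle insertions and deletions.

```python
def find_substitutions(reference, sample):
    """Return (position, reference_base, sample_base) for every position
    where two aligned, equal-length sequences differ. Positions count from 1."""
    if len(reference) != len(sample):
        raise ValueError("this sketch only compares sequences of equal length")
    return [
        (index + 1, ref_base, sample_base)
        for index, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Hypothetical sequences, invented for illustration.
reference = "ATGGCCTGGAAATAA"
patient = "ATGGCCTGAAAATAA"
print(find_substitutions(reference, patient))
# [(9, 'G', 'A')]
```

In this made-up example, the single change at position 9 turns the codon TGG into TGA, a stop codon, which shows how one altered letter in a coding region can cut a protein short.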
The physician will refer the patient to a biotechnology lab. The lab technician will take a small blood sample from the patient, separate the white blood cells (WBC), extract the DNA, sequence the genome, and compare it with the Reference Sequence letter by letter, word by word and sentence by sentence. The result is sent to the physician, who can identify the mutations responsible for the disease. This provides the best diagnostic method to identify a disease. Our genome is not just a diagnostic road map of our genes; it tells us which good genes to clone and which bad genes to shut off. Good genes can be used to make their proteins on a large scale for worldwide use, such as insulin and human growth hormone. Identifying the bad genes, on the other hand, tells us where to design novel drugs to shut off the genes responsible for serious diseases.
We have already demonstrated that, using genetic engineering techniques, we can cut, paste, copy and sequence a good gene for industrial-scale production, such as the insulin gene mentioned above, whose product is used to treat 300 million diabetics around the world. Genome sequencing of bad genes starts a new era of Genomic Medicine, which is based on developing new drugs to treat a disease according to the genetic make-up of the individual. The next step would be to design drugs to shut off the mutated genes. Gene therapy will work if the disease is caused by a single gene mutation. Drug therapy will work if multiple genes are responsible for the disease, as in cancers, cardiovascular diseases and Alzheimer’s disease.
Genomic Medicine
The first step is to cut the human genome at specific sites using restriction enzymes (molecular scissors such as EcoRI) and so prepare a restriction site map, work first accomplished by Salvador Luria, Max Delbrück and Hamilton Smith. A naked gene is a piece of DNA whose messenger RNA has a start codon, AUG, and, after a few thousand nucleotides, ends at one of the three stop codons UAA, UAG or UGA. If such a fragment of human DNA is not protected, it will be destroyed by enzymes. It is protected by recombinant technology (making a hybrid), that is, by recombining it with the DNA of a virus, a plasmid or a chloroplast (for plants), which serves as a vector. Once the human DNA fragment is stabilized in a vector, we can store it, purify it, and make millions of copies (clones) of it by transferring it into host cells such as bacteria, yeast or mammalian cells, in which it replicates autonomously to produce a library of genes.
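The cutting step can be illustrated with EcoRI, which recognizes the sequence GAATTC and cuts after the first G. The sketch below works on one strand of a linear piece of DNA only; the function name and the example sequence are hypothetical, and it ignores the complementary strand and the sticky ends that a real digest produces.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence
CUT_OFFSET = 1         # EcoRI cuts after the first G: G^AATTC


def digest(dna, site=ECORI_SITE, offset=CUT_OFFSET):
    """Cut one strand of linear DNA at every occurrence of the site
    and return the resulting fragments in order."""
    fragments = []
    fragment_start = 0
    position = dna.find(site)
    while position != -1:
        cut = position + offset
        fragments.append(dna[fragment_start:cut])
        fragment_start = cut
        position = dna.find(site, position + 1)
    fragments.append(dna[fragment_start:])  # whatever follows the last cut
    return fragments


# Hypothetical sequence containing two EcoRI sites.
print(digest("TTGAATTCGGCCGAATTCAA"))
# ['TTG', 'AATTCGGCCG', 'AATTCAA']
```

Joining the fragments back together gives the original sequence, and the lengths of the fragments are what a restriction site map records.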
Each library contains millions of copies of identical genes that produce the same protein. Before the genetic revolution, insulin was extracted from the pancreas of slaughtered animals and used to treat diabetes; a tiny amount of impurity could set off anaphylactic shock and kill the patient. Now, highly pure human insulin produced by genetic engineering is used to treat 300 million diabetic patients worldwide without the loss of a single life. Other products of Genomic Medicine, such as growth hormone and the factor VIII protein used to treat hemophilia, are being developed by the same recombinant technology. The essence of life is information, and the information is located on the four nucleotide bases, paired A-T and G-C. According to the Central Dogma of Crick and Watson, the information on DNA is transcribed into RNA, which is translated into protein in the ribosome. Attempts are being made to design drugs that attack cancer cells at all three levels: DNA, RNA and protein.
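The flow of information in the Central Dogma, from DNA to RNA to protein, can be followed in a small sketch. It makes two simplifying assumptions: the DNA given is the coding strand, so the messenger RNA is the same sequence with U in place of T, and the codon table holds only the handful of entries needed for the invented example rather than all 64 codons.

```python
# A deliberately partial codon table: just enough for the toy example.
CODON_TABLE = {
    "AUG": "Met", "GCU": "Ala", "UGG": "Trp", "AAA": "Lys",
    "UUU": "Phe", "GGC": "Gly",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}


def transcribe(coding_strand):
    """DNA coding strand -> messenger RNA (T is replaced by U)."""
    return coding_strand.replace("T", "U")


def translate(mrna):
    """Read the mRNA three letters at a time until a stop codon appears."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "???")  # ??? = not in the toy table
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


dna = "ATGGCTTGGAAATAA"  # hypothetical coding strand
mrna = transcribe(dna)
print(mrna)             # AUGGCUUGGAAAUAA
print(translate(mrna))  # ['Met', 'Ala', 'Trp', 'Lys']
```

Drugs aimed at the DNA, RNA or protein level interrupt this chain at the first, second or third step respectively.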
Herceptin, a novel class of drug, has been successful in attacking a protein. Craig Mello designed double-stranded RNA to shut off a gene and prevent its translation into protein. The attack on DNA to shut off a gene was carried out by Ross using highly toxic nitrogen mustard. Gene therapy cannot be applied to diseases with multiple genetic defects, such as cancers or heart diseases; for these, drug therapy could be used to develop novel treatments. Professor WCJ Ross of London University was the first person to design drugs to attack DNA for cancer treatment. He designed cross-linking agents, such as nitrogen mustard, that link together both strands of DNA. Relatives of nitrogen mustard are extremely toxic: sulfur mustard was used as a chemical weapon during the First World War, and hundreds of more toxic nitrogen mustard analogs were developed during the Second World War. Soldiers exposed to these mustards showed a sharp decline in white blood cells (WBC), from 5,000 cells/cc to 500 cells/cc [7,8].
Children suffering from childhood leukemia have a very high WBC count, over 90,000/cc. Most of these WBCs are immature, defective and unable to defend the body against microbial infections. Ross’s rationale was that cancer cells divide faster than normal cells, so by using nitrogen mustard he could cross-link their DNA and prevent cell division. Once he had demonstrated that he could shut off a gene by cross-linking DNA, he reasoned that he could shut off a mutated gene in any of the 220 tissues of the human body by finding a dye that specifically colors that tissue, attaching the nitrogen mustard group to the dye, and so attacking the cancer genes in that tissue. Ross was the first person to use war chemicals successfully to treat cancer. Although such drugs are highly toxic, they destroy more cancer cells than normal cells. Over decades, Ross made several hundred derivatives of nitrogen mustard as cross-linking agents. Some are useful for treating cancers, such as Chlorambucil for childhood leukemia (which brought the WBC level down to 5,000/cc) and Melphalan and Myrophine for pharyngeal carcinomas [9-15].
Because of the high toxicity of nitrogen mustard, new drugs could not be developed to treat other types of oral or lung cancers. As I showed above, under the Human Genome Project we sequenced the entire genome of a healthy human being, our book of life, letter by letter, word by word, sentence by sentence and chapter by chapter: all forty-six volumes, written in six billion four hundred million genetic letters (nucleotides). We can use this healthy genome as a Reference Sequence for comparison. Using the nano capillary method, it took us 13 years to sequence the entire human genome at a cost of $3 billion. Now we have developed next-generation sequencers, such as Nanopore technology, which sequence an entire genome faster and more cheaply. From a biopsy sample, we can take a single cell from the lung or oral tumor of a smoker, sequence its genome, and compare it with the Reference Sequence to identify the number and location of all mutations or damaged genes caused by smoking. Recently, we also completed the 1000 Genomes Project, which provides about a thousand versions of each gene for comparison.
We have also learned to convert the analog language of biology into the digital language of computers. Now we can write a program that reads and compares sequences and sends the results at the speed of light to another country. When the smoker’s gene sequence is compared with the Reference Sequence, the program identifies the mutations with precision and accuracy. Once the mutations responsible for lung or oral carcinoma are identified, we can design drugs to shut off those genes. At London University, I was a graduate student of Professor Ross, then his post-doctoral fellow, and then his special assistant. For almost ten years, I worked with Professor Ross making derivatives of nitrogen mustard as anticancer agents. While Professor Ross was designing drugs to attack both strands of DNA, which are extremely toxic, I was assigned, as part of my doctoral thesis, to design drugs to attack a single strand of DNA.
I was successful in designing a novel class of drugs that attack only one strand of DNA. This class of drugs is called aziridines [16-18]. I made over 100 analogs of aziridine dinitrobenzamide (CB1954), which attack the DNA of Walker Carcinoma 256, a solid, aggressive tumor in rats. It took me about ten years to make CB1954, a novel drug to shut off the mutated gene responsible for Walker Carcinoma 256, and, using the same rationale, about a quarter of a century to make AZQ, which shuts off the gene responsible for Glioblastoma, a brain tumor in humans. This example shows how easy it is to get lung or oral cancer simply by smoking a dozen genetically enhanced, high-nicotine cigarettes, and how expensive, time-consuming and exhausting it is to find a possible cure. A drug must be safe and effective. If, after a year of use, the FDA receives an Adverse Effect Report, the drug is withdrawn.
All the effort is wasted. Selectivity is measured by comparing toxicity to normal cells with toxicity to abnormal cells; the ratio is called the Therapeutic Index (TI). The TI of most cross-linking nitrogen mustards is ten, whereas the TI of the aziridine CB1954 (aziridine dinitrobenzamide) is 70, which means that CB1954 is seventy times more toxic to cancer cells than to normal cells. The Walker tumor not only stopped growing but also shrank. My rationale was simple: the aziridine attacks a single strand of DNA in acidic medium, particularly at the N-7 position of guanine. The dye dinitrobenzamide has great affinity for the Walker tumor, so CB1954 stains the tumor. CB1954 acts as a prodrug: it remains inactive at neutral or basic pH but is activated in acidic solution. As the tumor grows, it uses glucose as a source of energy, and the glucose is broken down to lactic acid. It is this acid that attacks the aziridine ring. The ring opens to generate a carbonium ion, which attacks the most negatively charged site on a single strand of DNA, the N-7 of guanine, shutting off the Walker Carcinoma gene.
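The Therapeutic Index mentioned above is just a ratio, and a small sketch makes the comparison between TI = 10 and TI = 70 concrete. The doses below are hypothetical numbers in arbitrary units, chosen only so that the ratios reproduce the two values in the text; they are not measured data, and the function name is invented for this sketch.

```python
def therapeutic_index(dose_toxic_to_normal_cells, dose_toxic_to_cancer_cells):
    """Ratio of the dose that harms normal cells to the dose that harms
    cancer cells. A larger value means a more selective drug."""
    if dose_toxic_to_cancer_cells <= 0:
        raise ValueError("the dose toxic to cancer cells must be positive")
    return dose_toxic_to_normal_cells / dose_toxic_to_cancer_cells


# Hypothetical doses in arbitrary units, picked to match the ratios in the text.
print(therapeutic_index(10.0, 1.0))  # 10.0, as for most cross-linking nitrogen mustards
print(therapeutic_index(70.0, 1.0))  # 70.0, as for CB1954
```

Read this way, a TI of 70 means that normal cells tolerate a dose seventy times larger than the dose that kills the tumor cells.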
To continue my work, I was honored with the Institute of Cancer Research post-doctoral fellowship award of the Royal Cancer Hospital of London University. To increase the toxicity of CB1954 to Walker Carcinoma, I made 20 additional analogs. When I attached one more carbonium-generating moiety, a carbamate, to the aziridine dinitrobenzene, the resulting compound, Aziridine Dinitrobenzene Carbamate, was so toxic that its Therapeutic Index could not be measured. For safety reasons, further work at London University was stopped. I continued with the same rationale in America when I was offered the Fogarty International Postdoctoral Fellowship Award to work at the National Cancer Institute (NCI) of the National Institutes of Health (NIH) in Bethesda, Maryland, USA. I brought from London University the idea of attacking one strand of DNA using aziridine, but I did not want to use the same dye, dinitrobenzamide.
One day, I heard a lecture at NIH in which the speaker stated that methylated, radiolabeled quinone crosses the blood-brain barrier. When radiolabeled quinone was injected intravenously into mice, the entire radioactivity was concentrated in the brain within 24 hours. I knew that Glioblastoma multiforme, the brain tumor in humans, is a solid, aggressive tumor like Walker Carcinoma in rats. I decided to use the quinone moiety as a carrier for aziridine rings to attack Glioblastoma. I remembered that by introducing just one aziridine and one carbamate moiety to the dinitrobenzene ring at London University, I had produced a compound so toxic to tumors that its toxicity could not be measured. With the quinone ring, I could introduce two aziridine rings and two carbamate moieties and create havoc for Glioblastoma. Within three years, I made 45 analogs of quinone. One of them carries two aziridine and two carbamate moieties and was highly toxic to Glioblastoma.
The tumor stopped growing and started shrinking. I named this diaziridine dicarbamate quinone AZQ. My major concern was how toxic the compound would be to normal brain cells. Fortunately, normal brain cells do not divide; only the cancer cells divide. AZQ acts as a prodrug. A prodrug is a compound carrying a chemical masking group that renders it inactive and nontoxic. Once the prodrug reaches a treatment site in the body, removing the mask frees the active drug only where it is needed, which helps avoid systemic side effects. To grow rapidly, cancer cells use glucose as a source of energy, and the glucose is broken down to produce lactic acid. It is this acid that activates the aziridine and carbamate moieties, generating carbonium ions that attack the Glioblastoma, which stops growing and starts shrinking. My drug AZQ is successful in treating experimental brain tumors because I rationally designed it to attack dividing DNA. Radiolabeled studies showed that AZQ binds to the DNA of cancer cells and destroys the brain tumor, while normal brain cells are not affected. AZQ is a new generation of drug.
Not so long ago, these cancers meant death. Now we have changed that from certain death to certain survival. The immunologists in our laboratories are developing a new treatment technique, making radiolabeled antigens to attack the remaining cancer cells without harming normal cells. We have cured many forms of cancer. We have eliminated childhood leukemia, Hodgkin disease and testicular cancer, and now AZQ-type compounds are being developed rationally. While most anti-cancer drugs on the market, such as Adriamycin, Mitomycin C and Bleomycin, were selected by NCI after random trials of thousands of chemicals, AZQ is rationally designed to attack the DNA of cancer cells in the brain without harming normal cells. We are testing combinations of these drugs to treat a variety of experimental cancers in animals [19-21]. In developing drugs for treatment, we poison bad DNA selectively. Poisons as a class are chemicals that attack all DNA, good and bad alike. Chemicals that cause cancer can, at a safe level, also cure cancer. Science teaches us to attack the bad sets of DNA selectively without harming the good sets.
Poisons are injurious to living creatures. There is a small class of chemicals that, when humans are exposed to them, disrupt the function of DNA and make normal cells abnormal; they are called cancer-causing chemicals, or carcinogens. I must confess that we still use surgery to cut off a cancerous breast, we still burn cancer cells with radiation, and we still poison cancer cells with chemicals. The largest killer of women is breast cancer. After all the treatment, the remaining cancer cells can return as metastatic cells and kill breast cancer patients within three years. A decade from now, these methods could be considered brutal and savage, but today that is all we have. We hope to develop new treatments for breast cancer. Hope means never, ever giving up. As I said above, I rationally design drugs to treat brain cancer. I am the discoverer of AZQ (US Patent Nos. 4,146,622 and 4,233,215). I shared a 17-year royalty with two of my colleagues. The discovery of AZQ was a quarter-century-long effort, starting at the Royal Cancer Hospital, University of London, England, and ending at the National Cancer Institute, Washington, America.
Some may think that we were very lucky. The fact is that luck had nothing to do with it. It was sheer hard work. I had already made over one hundred aziridine derivatives, which were tested against tumors in experimental animals and published with Professor Ross, before I came to America and joined the NCI (National Cancer Institute). Let me share with you how we sweated to make AZQ. To introduce one successful drug for treating one kind of cancer, over a 25-year period I conducted over 500 experiments, out of which 200 drugs were tested in thousands of animals. Only 45 drugs were considered valuable enough to be patented by the US government, and only one drug, AZQ, has undergone extensive Phase III clinical trials, which showed that patients receiving AZQ live 20 to 24 months longer than untreated patients. This period gives physicians enough time to develop alternative treatments, such as immunotherapy, to eliminate the remaining resistant cancer cells. For the discovery of AZQ, I was honored with the “2004 NIH Scientific Achievement Award”, one of America’s highest awards in medicine (Figures 1-6).
Conclusion
The impact of sequencing the human genome on our understanding of the origin of life has been considered. Has all life always been on planet Earth as it is today? The answer is no. The sequencing of hundreds of living species showed that complex life developed from simpler life forms over millions of years. We have a common ancestor who came out of Darwin’s warm little pond over four billion years ago; the proof of our common ancestor came from sequencing the book of life of many species and comparing their genomes. We discovered that the book of life of every living creature, from the tiny blade of grass to the mighty elephant, including man, mouse and monkey, is written using the same four genetic letters, the nucleotides Adenine (A), Thymine (T), Guanine (G) and Cytosine (C), and it is written in a double helix. Life is simply a series of coordinated, complex chemical reactions inscribed on strings of these four nucleotides, called DNA.
The proof of our evolution from simpler to more complex forms of life came from sequencing DNA extracted from fossils trapped in ancient rocks. If you examine the geologic layers of rock from the Precambrian era to the present, you will find no human fossils until you come to rocks about three million years old. Donald Johanson and his team found the first fossil of a bipedal chimp-human in three-and-a-half-million-year-old rocks near Hadar in Ethiopia. These were the bones of an 18-year-old woman called Lucy. We have all descended from her. She was the mother of us all. The sooner we learn this truth, that you and I are brothers and sisters, children of the same mother, a black woman who was born in Africa three and a half million years ago, the better it is for all of us. Then, and only then, will men and women of different races, different religions and different nations respect each other and treat each other like brothers and sisters, and that time begins now.