Watson and Crick’s discovery of the DNA double helix structure in 1953 galvanized the scientific community. Within a few years, attempts were being made to determine the DNA sequence. Knowing the order of the nucleotides in DNA would make it possible to locate genes and to compare mutations across populations and species. In the 1970s, two methods were independently developed to sequence DNA. The American team, led by Maxam and Gilbert, used a “chemical cleavage protocol”, while the English team, led by Sanger, developed the “chain termination method”. Gilbert and Sanger shared the 1980 Nobel Prize in Chemistry (with Paul Berg), but Sanger’s method survived the test of practicality and became the gold standard.

METHOD:

The Sanger method uses 2′,3′-dideoxynucleotide triphosphates (ddNTPs). ddNTPs differ from deoxynucleotide triphosphates (dNTPs) by having a hydrogen atom at the 3′ carbon instead of a hydroxyl group. Once incorporated, a ddNTP terminates DNA chain elongation because, lacking the 3′ hydroxyl, it cannot form a phosphodiester bond with the next dNTP.

The double-stranded DNA is first converted to single-stranded DNA by denaturing it with NaOH. The Sanger method requires the single-stranded DNA template to be sequenced, DNA polymerase, DNA primers, and four reaction tubes, each containing all four dNTPs (dATP, dCTP, dGTP, and dTTP) plus one ddNTP. For example, the “G” tube contains all four dNTPs, ddGTP and DNA polymerase; the “A” tube contains all four dNTPs, ddATP and DNA polymerase; and so on. The concentration of the ddNTP is about 1% of the concentration of the dNTPs. When the DNA polymerase is added, polymerization takes place and terminates whenever a ddNTP is incorporated into the growing strand, so each tube ends up with fragments of every length that ends in that tube’s base.

Once the reactions in all four tubes are complete, the DNA is denatured for electrophoresis. The contents of the four tubes are run in separate lanes of a polyacrylamide gel, and the gel is then exposed to X-ray film for reading. Smaller fragments travel faster and farther than larger ones because of their lower molecular weight. When the four lanes are read together from the bottom of the gel up, they give the 5′ to 3′ sequence of the newly synthesized strand, which is complementary to the template strand.
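The chain termination idea can be sketched as a small simulation. This is only a toy model under stated assumptions: the template string, the function names `run_tube` and `read_gel`, and the number of simulated molecules are made up for illustration, and the 1% termination chance stands in for the ddNTP to dNTP ratio mentioned above.

```python
import random

# Watson-Crick pairing used to pick the base added opposite each template base.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def run_tube(template_3to5, dd_base, dd_fraction=0.01, molecules=20000, rng=None):
    """Simulate one tube; return the set of fragment lengths it produces.

    template_3to5 is the template written 3'->5', so the new strand grows
    5'->3' as we walk along it. Each time the incoming base matches the
    tube's ddNTP there is a small chance that the chain terminates.
    """
    if rng is None:
        rng = random.Random(0)
    lengths = set()
    for _ in range(molecules):
        for length, template_base in enumerate(template_3to5, start=1):
            incoming = COMPLEMENT[template_base]
            if incoming == dd_base and rng.random() < dd_fraction:
                lengths.add(length)  # chain ends here with a ddNTP
                break
    return lengths

def read_gel(lanes):
    """Read the four lanes from the shortest fragment (bottom of gel) upward."""
    by_length = {n: base for base, lengths in lanes.items() for n in lengths}
    return "".join(by_length[n] for n in sorted(by_length))

template = "TACGGATCCGAT"  # hypothetical template, written 3'->5'
rng = random.Random(0)
lanes = {base: run_tube(template, base, rng=rng) for base in "ACGT"}
print(read_gel(lanes))
```

With enough simulated molecules every fragment length shows up in exactly one lane, so reading the lanes from shortest to longest recovers the new strand, which is the complement of the template.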

ADVANCES:

Automated sequencing: Sanger’s single radioactive label was replaced with four fluorescent dyes, one for each of the four ddNTPs, and the reactions are performed in a single tube containing all four ddNTPs, each labeled with a different color dye. Gel electrophoresis is then done in a single lane instead of four, and the DNA sequence is read from a chromatogram.

Microfluidic Sanger sequencing: Here the sequential steps of the Sanger method are integrated on a wafer-scale chip using nanoliter-scale sample volumes. It generates long and accurate sequences while overcoming many shortcomings of the conventional Sanger method.

Capillary Sanger sequencing: This approach eliminates the use of slab gels. Instead, a semi-liquid polymer is injected into the capillary before each run. Automated injection of samples allows hands-off operation throughout a run, and the amount of DNA sample required is reduced. Each run requires less than 3 hours.

The Human Genome Project: The 13-year-long project of decoding the human genome was completed in 2003. The Sanger sequencing technique was one of the methods used to sequence the DNA.

January 17th, 2014

Posted In: Uncategorized


With Sanger-based sequencing techniques, scientists began decoding genetic sequences from a wide array of species. Even though this was an exemplary development, the limitations of the existing methods soon became a hindrance. Next Generation Sequencing (NGS) was developed to overcome these limitations and has revolutionized progress in genomic science. The idea behind NGS technology is similar to Sanger-based methods: the bases of a small fragment of DNA are sequentially identified as each fragment is re-synthesized from a DNA template strand. But NGS can process many reactions simultaneously instead of one or a few at a time. If NGS had been developed before the Human Genome Project, the human genome could have been decoded in about a week for a few thousand dollars instead of 13 years and roughly 3 billion dollars.

Applications of Next Generation Sequencing: It is used for whole genome sequencing, targeted sequencing, resequencing, target enrichment, studies of gene regulation, transcription analysis, epigenetic changes, metagenomics, paleogenomics, etc.

Whole genome sequencing: Until a few years ago, sequencing an entire genome was arduous and time-consuming. With Next Generation Sequencing, large genomes can be sequenced in a few days. Sequencing genomes that have never been sequenced before (de novo sequencing) poses an interesting challenge, because the reads have to be assembled without aligning them to a reference sequence. A problem with de novo sequencing is that the short read lengths generated by NGS can lead to a higher number of gaps and regions where no reads align, resulting in greater fragmentation, shorter contiguous sequences and therefore poorer data quality. This is usually seen in regions of the genome containing repetitive sequence elements.
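A toy example may help show why repeats fragment a de novo assembly. The sketch below is not an assembler; it only looks for k-mers that are followed by more than one different base in the reads, which is where an assembler can no longer tell which way to extend. The genome string, the read length and the function name `branching_kmers` are all assumptions made for illustration.

```python
from collections import defaultdict

def branching_kmers(reads, k):
    """Return the k-mers that are followed by more than one distinct base."""
    successors = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k):
            successors[read[i:i + k]].add(read[i + k])
    return {kmer: sorted(nxt) for kmer, nxt in successors.items() if len(nxt) > 1}

# Hypothetical genome in which the 7-base element GATTACA occurs twice.
genome = "CCGATTACATTGGATTACAGGA"
read_length = 10
reads = [genome[i:i + read_length] for i in range(len(genome) - read_length + 1)]

print(branching_kmers(reads, 4))  # overlaps shorter than the repeat are ambiguous
print(branching_kmers(reads, 8))  # overlaps longer than the repeat are not
```

With 4-base overlaps the k-mer TACA can be followed by either T or G, so the assembly breaks at the repeat. With 8-base overlaps, which span the whole 7-base repeat, the ambiguity disappears. Short reads limit how long such overlaps can be.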

Targeted sequencing: In this technique, only the genes or defined regions of interest in a genome are sequenced. This approach is used to sequence large numbers of individuals in order to discover, screen and validate genetic variation within a population and to identify rare genetic variants. The two methods for making libraries for targeted sequencing projects are target enrichment and amplicon sequencing.

Next-Gen Methods:

  1. Single-molecule real-time sequencing (SMRT) is based on the sequencing by synthesis approach and allows detection of nucleotide modifications such as cytosine methylation.
  2. Ion semiconductor sequencing (Ion Torrent sequencing) uses a semiconductor-based detection system to sense the hydrogen ions that are released during the polymerization of DNA, instead of the optical detection used in other methods.
  3. Pyrosequencing (454) amplifies DNA inside water droplets in an oil solution (emulsion PCR), with each droplet containing a single DNA template attached to a single primer-coated bead, which then forms a clonal colony. The machine contains many picoliter-volume wells, each holding a single bead and sequencing enzymes. Pyrosequencing uses luciferase to generate light for detection of the nucleotides.
  4. Sequencing by synthesis (Illumina) is based on reversible dye-terminator technology and engineered polymerases. DNA molecules and primers are attached to a slide and amplified with polymerase to form “DNA clusters”. Then four types of reversible terminator bases (RT-bases) are added and non-incorporated nucleotides are washed away. A camera takes images of the fluorescently labeled nucleotides.
  5. Sequencing by ligation (SOLiD sequencing) uses oligonucleotides of a fixed length that are labeled according to the sequenced position and then annealed and ligated by DNA ligase. Before sequencing, the DNA is amplified by emulsion PCR. The resulting beads, each carrying copies of a single DNA molecule, are deposited on a glass slide.
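For the two methods in the list that detect a by-product of polymerization (hydrogen ions in Ion Torrent, light in pyrosequencing), the raw data can be pictured as one signal value per nucleotide flow. The sketch below assumes, as background not stated in the list above, that the four nucleotides are flowed one at a time in a fixed repeating order and that the signal is roughly proportional to the number of bases incorporated. The flow order, the signal values and the name `decode_flowgram` are hypothetical.

```python
def decode_flowgram(signals, flow_order="TACG"):
    """Turn per-flow signal strengths into a base sequence.

    Each flow offers a single nucleotide. A signal near 0 means nothing was
    incorporated, near 1 means one base, near 2 means two identical bases
    in a row (a homopolymer), and so on.
    """
    bases = []
    for i, signal in enumerate(signals):
        count = max(0, round(signal))  # nearest whole number of bases
        bases.append(flow_order[i % len(flow_order)] * count)
    return "".join(bases)

# Hypothetical signals for eight flows: T, A, C, G, T, A, C, G
signals = [1.02, 0.05, 2.1, 0.0, 0.0, 0.97, 0.1, 3.05]
print(decode_flowgram(signals))  # TCCAGGG
```

Because the length of a run of identical bases has to be inferred from signal strength, a rounding step like this one is where long homopolymers become hard to call.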

 

January 17th, 2014

Posted In: Uncategorized


Illumina sequencing is one of the new generation techniques of DNA sequencing. It was developed by researchers at Manteia Predictive Medicine, which was acquired by Solexa in 2004; Solexa was later acquired by Illumina, hence the name Illumina sequencing. This sequencing method is based on reversible dye-terminators that enable the identification of single bases as they are introduced into DNA strands. The use of clonal arrays and massively parallel sequencing of short reads using reversible terminators was subsequently referred to as sequencing by synthesis technology, or SBS.

This method is often used to sequence difficult regions, such as homopolymers and repetitive sequences. It can also be used for whole-genome sequencing, region-targeted sequencing, de novo sequencing, transcriptome analysis, small RNA analysis, methylation profiling, and genome-wide protein-nucleic acid interaction analysis.

Method:

DNA molecules are first attached to primers on a slide and then amplified by bridge amplification so that local clonal colonies of DNA, or “DNA clusters”, are formed. This cluster generation technology was invented and developed by Dr Pascal Mayer and Dr Laurent Farinelli in 1996 at Glaxo Wellcome’s Geneva Biomedical Research Institute (GBRI).

The DNA templates are sequenced base by base, in parallel, using four types of reversible terminator bases (RT bases): adenine, cytosine, guanine, and thymine. Each base is fluorescently labeled with a different color and carries a blocking group at its 3′ end. The four bases compete for binding sites on the template DNA to be sequenced, and unincorporated nucleotides are washed away. This natural competition helps keep incorporation bias low. After each synthesis step a laser is used to excite the clusters, and the fluorescent color specific to one of the four bases identifies the base that was added. The 3′ blocking group and the fluorescent label are then removed so that the next cycle can begin. The process is repeated until the full DNA molecule is sequenced. This technique allows very large arrays of DNA colonies to be captured in sequential images taken by a single camera.
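The cycle-by-cycle readout can be sketched in a few lines. This is a simplified illustration rather than how any real base-calling software works: it assumes one fluorescence value per base per cycle for a single cluster and simply picks the brightest color. The intensity numbers and the function name `call_bases` are made up.

```python
def call_bases(cycle_intensities):
    """Call one base per sequencing cycle for a single cluster.

    cycle_intensities is a list with one dict per cycle, mapping each base
    to the fluorescence measured in that base's color channel.
    """
    read = []
    for intensities in cycle_intensities:
        # Only one labeled base is incorporated per cycle, so the
        # brightest channel indicates which base was added.
        read.append(max(intensities, key=intensities.get))
    return "".join(read)

# Hypothetical measurements for four cycles of one cluster.
cluster = [
    {"A": 0.91, "C": 0.04, "G": 0.03, "T": 0.02},
    {"A": 0.05, "C": 0.02, "G": 0.88, "T": 0.05},
    {"A": 0.03, "C": 0.90, "G": 0.04, "T": 0.03},
    {"A": 0.02, "C": 0.03, "G": 0.05, "T": 0.90},
]
print(call_bases(cluster))  # AGCT
```

On a real instrument the same decision is made for a very large number of clusters in every image, which is what makes the method massively parallel.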

This technique offers a number of advantages over traditional sequencing methods such as Sanger sequencing. Because Illumina dye sequencing is automated, it is possible to sequence many strands at once and obtain sequencing data quickly. Additionally, this method uses only DNA polymerase, as opposed to the multiple, expensive enzymes required by other sequencing techniques such as pyrosequencing.

Illumina sequencing has been used to study the transcriptomes of the sweet potato and the gymnosperm genus Taxus, and in oncology research. Illumina unveiled its HiSeq X Ten system on January 14th, 2014. It is designed to sequence about 20,000 genomes per year at a cost of about $1,000 each. Previously it cost about $10,000 to sequence a human genome.

January 17th, 2014

Posted In: Uncategorized


What is DNA?

Deoxyribonucleic acid, commonly known by its acronym DNA, is the hereditary material in humans and almost all other living organisms. It was discovered by Friedrich Miescher in 1869. It is a linear polymer that is located in the cell nucleus (nuclear DNA), where it is protected by the nuclear envelope. A small amount is also found in the mitochondria (mitochondrial DNA). Nearly every cell in the human body has the same DNA. Its function is to encode information: it encodes the sequences of all of the proteins that an organism needs to live. Because it is an extremely long polymer, a single molecule of DNA can encode the information necessary to produce thousands of proteins.

Information in DNA is stored as a code made up of four chemical bases: the purine bases, adenine (A) and guanine (G), and the pyrimidine bases, cytosine (C) and thymine (T). A single DNA molecule may contain more than 200 million nucleotide bases, and more than 99% of these bases are the same in all human beings. DNA bases pair up with each other, A with T and C with G, to form units called base pairs. Each base is also attached to a sugar molecule and a phosphate molecule. Together, a base, sugar, and phosphate are called a nucleotide. Nucleotides are arranged in two long strands that form a spiral called a double helix. The complementarity of the two strands, with strict adherence to the A-to-T and G-to-C pairing rules, is the basis for replication of the genes. The two strands separate during replication, and each serves as a template for the synthesis of a new complementary strand.
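The pairing rules and the template role of each strand can be written out as a short sketch. The example sequence and the function names are chosen only for illustration, and strands are written 5′ to 3′, which is why the partner strand is reversed as well as complemented.

```python
# A pairs with T and C pairs with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(strand):
    """Return the partner strand, also written 5'->3'."""
    return "".join(PAIR[base] for base in reversed(strand))

def replicate(strand):
    """Separate a double helix and rebuild a partner for each old strand.

    Returns two duplexes as (strand, partner) tuples. Each one contains
    one old strand and one newly synthesized strand.
    """
    old_partner = reverse_complement(strand)
    new_partner = reverse_complement(strand)          # templated by the first old strand
    new_strand = reverse_complement(old_partner)      # templated by the other old strand
    return (strand, new_partner), (new_strand, old_partner)

first, second = replicate("ATGCCG")
print(first)   # ('ATGCCG', 'CGGCAT')
print(second)  # ('ATGCCG', 'CGGCAT')
```

Both daughter duplexes carry the same sequence as the original, which is the point of complementary pairing: either strand alone is enough to rebuild the other.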

James Watson and Francis Crick, drawing on X-ray diffraction data from Rosalind Franklin, are the scientists who worked out the double-helix structure of DNA, which explained how DNA can carry the information needed to make all the proteins in the body. Crick went on to formulate the ‘central dogma’ that paved the way for modern molecular biology. It states that DNA can be copied to make more DNA, copying all the information in the original. The information in DNA can also be transcribed into RNA, which then directs the production of a protein whose sequence of amino acids is determined by the sequence of nucleotides in the DNA (and RNA). Each strand of DNA in the double helix can serve as a pattern for duplicating the sequence of bases. In this scheme, information flows either from DNA to DNA or from DNA to RNA to protein.
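The DNA to RNA to protein flow can be illustrated with a minimal sketch. It uses the standard genetic code, built here from a compact 64-letter string in TCAG order. The example gene is invented, and real transcription and translation involve much more (promoters, splicing, start-codon selection) than this shows.

```python
from itertools import product

# Standard genetic code, listed for codons in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def transcribe(coding_strand):
    """DNA -> RNA: the mRNA matches the coding strand with U in place of T."""
    return coding_strand.replace("T", "U")

def translate(mrna):
    """RNA -> protein: read three bases at a time until a stop codon (*)."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3].replace("U", "T")]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

gene = "ATGGCCTTTAAATGA"   # hypothetical coding strand, 5'->3'
mrna = transcribe(gene)
print(mrna)                # AUGGCCUUUAAAUGA
print(translate(mrna))     # MAFK
```

The output uses one-letter amino acid symbols, so MAFK stands for methionine, alanine, phenylalanine and lysine.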

 

 

Mitochondrial DNA also plays an important part in this information flow. It contains thirty-seven genes, all of which are essential for normal mitochondrial function. Thirteen of these genes provide instructions for making enzymes involved in oxidative phosphorylation. The remaining twenty-four genes provide instructions for making transfer RNAs (tRNAs) and ribosomal RNAs (rRNAs), molecules that are chemically related to DNA. These types of RNA help assemble protein building blocks (amino acids) into functioning proteins.
The discovery of DNA function and structure is one of the most important contributions to science and has enabled scientists and clinicians to better understand the physiology of the human body.

January 16th, 2014

Posted In: Handbook


Proteins are very large, complex molecules (macromolecules) that have a specific function, and sometimes more than one. They play a very important role in all aspects of cell structure and function. Although they are complex, they have a simple underlying structure that does not form branches or circles, which is why they are referred to as linear polymers. The simple units that make up proteins are known as amino acids, and most organisms use about 20 different kinds. These 20 naturally occurring amino acids are grouped according to criteria such as hydrophobicity, size, aromaticity or charge. They are Glycine (Gly / G), Alanine (Ala / A), Valine (Val / V), Phenylalanine (Phe / F), Proline (Pro / P), Isoleucine (Ile / I), Leucine (Leu / L), Methionine (Met / M), Aspartic acid / Aspartate (Asp / D), Glutamic acid / Glutamate (Glu / E), Lysine (Lys / K), Arginine (Arg / R), Serine (Ser / S), Threonine (Thr / T), Tyrosine (Tyr / Y), Histidine (His / H), Cysteine (Cys / C), Asparagine (Asn / N), Glutamine (Gln / Q) and Tryptophan (Trp / W). As shown above, they are commonly identified by their three-letter abbreviations or one-letter symbols.

The term amino acid refers to any molecule containing both an amino group and any type of acid group. All of the amino acids found in proteins share the same structural features: an amino group, a carboxyl group and an alpha carbon (proline is a partial exception, because its amino group is part of a ring). The order in which the amino acids are arranged in a protein chain is known as the protein sequence, and a typical protein is a string of several hundred amino acids. The amino acids within these chains are linked by amide (peptide) bonds, and the chains are referred to as polypeptide chains. The chains are flexible, and their sequence determines the characteristics and functional capabilities of the protein.
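Since protein sequences are usually written with the one-letter symbols, a small lookup table is a convenient way to expand them into the three-letter abbreviations. The sketch below covers only the 20 standard amino acids listed above; the function name `to_three_letter` and the example peptide are made up for illustration.

```python
# One-letter symbol -> three-letter abbreviation for the 20 standard amino acids.
THREE_LETTER = {
    "G": "Gly", "A": "Ala", "V": "Val", "F": "Phe", "P": "Pro",
    "I": "Ile", "L": "Leu", "M": "Met", "D": "Asp", "E": "Glu",
    "K": "Lys", "R": "Arg", "S": "Ser", "T": "Thr", "Y": "Tyr",
    "H": "His", "C": "Cys", "N": "Asn", "Q": "Gln", "W": "Trp",
}

def to_three_letter(sequence):
    """Expand a one-letter protein sequence, e.g. 'MAFK' -> 'Met-Ala-Phe-Lys'."""
    return "-".join(THREE_LETTER[symbol] for symbol in sequence.upper())

print(to_three_letter("MAFK"))  # Met-Ala-Phe-Lys
```

A symbol outside the table raises a `KeyError`, which is a reasonable way to flag an unexpected character in a short script like this.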

Proteins perform specific activities in our body in different forms, such as enzymes, hormones, antibodies, haemoglobin (in blood) and growth and maintenance proteins. Collagen is the most prevalent protein in human beings. It forms strong sheets that support skin, internal organs and tendons, as well as the firm substance (cartilage) that gives shape to the nose and ears. It is one of the largest proteins in the body, but it is made up mostly of the same three amino acids repeated over and over again. Proteins can be classified by their functions: structural proteins, enzymatic proteins, transport proteins, contractile proteins, protective proteins, hormonal proteins and toxins. Sometimes different proteins form stable complexes that work together as a group to perform a specific function. Various techniques have been developed to study the structure and functions of proteins in modern laboratories, including mass spectrometry and nuclear magnetic resonance. The study of proteins is one of the most important branches of science, and there is no clear division between the organic chemistry of proteins and their biochemistry.

January 16th, 2014

Posted In: Handbook


The gene is the basic functional unit of heredity. Genes are made up of deoxyribonucleic acid (DNA) or, in the case of some viruses, ribonucleic acid (RNA). Each contains a particular set of instructions, usually coding for a specific protein or function. A gene determines an observable trait or characteristic of an organism, and it can be defined as the DNA sequence that determines the chemical structure of a specific polypeptide or RNA molecule. The word ‘gene’ was coined by W. Johannsen in 1909, but the modern concept of the gene originated with Gregor Mendel, who in the 1860s studied the inheritance of characteristics that differed sharply and unambiguously among true-breeding varieties of garden peas. Mendel found that a hybrid between two phenotypically distinct varieties resembled one of the two parents, the one carrying the dominant trait.

In humans, genes vary in size from a few hundred DNA bases to more than 2 million bases, and the Human Genome Project estimated that humans have between 20,000 and 25,000 genes. All of the DNA in the cell makes up the human genome, and every person has two copies of each gene, one inherited from each parent. The DNA in the genes makes up only about 2% of the genome. Most genes are the same in all people, but a small number are slightly different between people. Alleles are forms of the same gene with small differences in their sequence of bases, and these small differences contribute to each person’s unique physical features (phenotype). Even when a trait is not observable in a person, its gene can still be passed on to the next generation; such a gene is known as recessive. A gene whose trait is expressed from one generation to the next without being masked is known as dominant. Some sequences are referred to as non-coding because they do not appear to contain information that cells use to make proteins. Contrary to popular belief, they are not functionless, and their roles are slowly being discovered.

The idea that genes are responsible for the manufacture of proteins was first proposed in 1902 by Sir Archibald Garrod, who realized that alkaptonuria was an inherited metabolic condition in humans and hypothesized that it was due to the absence of an enzyme (a catalytic protein) required for the breakdown of homogentisic acid. Systematic investigation of the relationship between genes and enzymes did not occur until 1941. It was later realised that, in order to make a protein, the gene is copied base by base into messenger ribonucleic acid (mRNA). The mRNA moves out of the cell nucleus to the ribosomes, which read it to assemble the polypeptide that folds to form the protein.

Information about the sequences and genes discovered in the human body is carefully recorded and placed in a database that is publicly available. Some DNA that has not yet been sequenced is also available, and any scientist, anywhere in the world, can sequence it and post the findings in the same database.

January 16th, 2014

Posted In: Handbook


The first major step that the scientific community hopes to achieve in genomic research is the invention of reliable, advanced technology that will enable completion of the Human Genome Project, so that the complete genetic composition and function of the human being can finally be understood. Next would be to establish reliable operation of public data repositories and to address the issues the scientific community faces given the current overwhelming increase in genetic data output. This should include allowing DNA sequences to be submitted with differentiated treatment for archiving. If correct sequence information is stored reliably and efficiently over the internet and scientists all over the world are able to access it, this will lead to major contributions to genomic research. Research could then be performed sooner than anticipated in areas such as: genomic sequences of key mammals; comprehensive collections of knockouts and knockdowns of all genes in selected animals, to accelerate the development of models of disease; cohort populations for studies designed to identify genetic contributors to health and to assess the effect of individual gene variants on disease risk; and reference sets of proteins and of coding sequences from key species in various formats.

Improvements in genomic research have become more important than ever, considering the increase in the numbers of infectious and non-infectious diseases. Therefore all societies, including those in less developed countries, must be encouraged to invest in large-scale human genome variation studies in order to better understand genetic variants in all human populations, with due regard to environmental factors. As science continues to advance, scientists must also be encouraged to develop clinically applicable tools from genetic research, for purposes such as risk prediction, diagnosis or therapeutic intervention. Growth in this area is critical for the understanding and eradication of non-communicable diseases because it will lead to the identification of rare and novel genetic variants associated with these diseases and their risk factors. It will also lead to the development of genome-based strategies for early detection, diagnosis and treatment of various diseases. Identification of single nucleotide polymorphisms will also reveal variations that contribute to a person’s response to certain drugs. This will contribute to the development of pharmacogenetics (the field of science that deals with how genes affect a person’s response to drugs).

Lastly, exploring the ethical, legal and social issues that affect genomic research is also crucial. Steps should be taken to ensure that these issues do not become obstacles to significant developments in the area of genomics. Deriving meaningful knowledge of the human genome is the ultimate aim of genomic research. Therefore, as we acknowledge remarkable achievements in the field such as the International HapMap Project, ENCODE and the Human Genome Project, it is crucial to emphasize that much about genetic composition, gene expression and gene function is still unknown, and that steps to advance this area of research are an absolute necessity.

January 16th, 2014

Posted In: Uncategorized


 

To find the position of a specific gene, we first need a map on which to locate it. One kind of map is made by staining chromosomes with chemicals, which produces distinctive patterns of bands; describing a gene’s position relative to these bands gives its cytogenetic location. The other kind of map gives a molecular location, which uses the sequence of DNA building blocks (base pairs) to locate a gene on a chromosome precisely.

Cytogenetic gene location

A gene’s cytogenetic location is normally given as a particular band on a stained chromosome, for example:
17q12
A range of bands can also indicate a gene’s location when the exact location is less well known:
17q12 – q21

A gene’s “address” is written with letters and numbers and has several parts that help describe it.

· The first part is a number or a letter that indicates the chromosome on which the gene can be found: 1 to 22 (the autosomes), or X or Y for the sex chromosomes.

· The second part is the arm on which the gene is located. Each chromosome has two arms: the longer arm is called q and the shorter arm is called p. So if the gene is located on the long arm of chromosome 1 we write 1q, and if it is located on the short arm of the X chromosome we write Xp.

· The third part is the gene’s position on the arm. The place where the two arms of the chromosome meet and the chromosome narrows is called the centromere, and the number given to a stained band (dark or light) increases with its distance from this point. For example, 14q21 is closer to the centromere than 14q22.

When a gene is located very near the centromere or very close to the end of the chromosome, abbreviations are used: “cen” when it is near the centromere and “ter” when it is at the end.
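The parts of a cytogenetic address can be pulled apart with a short script. This sketch handles only single locations of the form chromosome, arm, band (such as 17q12), not ranges or the “cen” and “ter” abbreviations. The function names `parse_location` and `closer_to_centromere` are made up for illustration, and comparing band numbers as plain numbers is a simplification.

```python
import re

# chromosome (1-22, X or Y), then arm (p or q), then band number
LOCATION = re.compile(
    r"(?P<chromosome>1[0-9]|2[0-2]|[1-9]|X|Y)(?P<arm>[pq])(?P<band>\d+(?:\.\d+)?)"
)

def parse_location(address):
    """Split an address such as '17q12' into (chromosome, arm, band)."""
    match = LOCATION.fullmatch(address.strip())
    if match is None:
        raise ValueError(f"not a cytogenetic location: {address!r}")
    arm = "long" if match["arm"] == "q" else "short"
    return match["chromosome"], arm, match["band"]

def closer_to_centromere(a, b):
    """Of two locations on the same arm, return the one nearer the centromere."""
    chrom_a, arm_a, band_a = parse_location(a)
    chrom_b, arm_b, band_b = parse_location(b)
    if (chrom_a, arm_a) != (chrom_b, arm_b):
        raise ValueError("locations must be on the same chromosome arm")
    # Band numbers increase with distance from the centromere.
    return a if float(band_a) <= float(band_b) else b

print(parse_location("17q12"))                   # ('17', 'long', '12')
print(closer_to_centromere("14q21", "14q22"))    # 14q21
```

The comparison only makes sense for two locations on the same arm, so the function refuses anything else.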

 

Molecular gene location

There are many ways to interpret the sequence of the human genome to find a gene’s molecular address. The sequence of base pairs of each chromosome, determined by the Human Genome Project (an international research effort completed in 2003), is used to locate genes. This method is more precise than cytogenetic location, although different ways of interpreting the sequence can give slightly different addresses for the same gene.

January 14th, 2014

Posted In: Handbook
