A team of scientists from the University of Maryland School of Medicine has found the strongest evidence yet that bacteria occasionally transfer their genes into the human genome, finding bacterial DNA sequences in about a third of healthy human genomes and in a far greater percentage of cancer cells. The results, published today (June 20) in PLOS Computational Biology, suggest that gene transfer from bacteria to humans is not only possible, but also somehow linked to over-proliferation: either cancer cells are prone to these intrusions or the incoming bacterial genes help to kick-start the transformation from healthy cells into cancerous ones.

“It really does seem that the human genome sequence data from somatic cells show signs of LGT events from bacteria, and so do cancer cells,” said Jonathan Eisen from the University of California, Davis, who coordinated the peer review of the new study but was not involved in the work. “Wild stuff does happen.”

The trillions of bacteria in our bodies regularly exchange DNA with each other, but the idea that their genes could end up in human DNA has been very controversial. In 2001, the team that sequenced the first human genome claimed to have found 113 cases of such lateral gene transfers (LGT), but their conclusion was later refuted.

This high-profile error “had a chilling effect on the field,” according to Julie Dunning Hotopp, who led the new study. Although her team has since found several cases of LGT between bacteria and invertebrates, “it’s still difficult to convince people that it may be happening in the human genome,” she said.

Rather than looking for bacterial genes that had become permanent parts of the human genome, Dunning Hotopp’s team searched for traces of microbial DNA in somatic cells, the cells of the body that do not form gametes.

Lab members David Riley and Karsten Sieber scanned publicly available data from the 1000 Genomes Project and found thousands of instances of LGT from bacteria, affecting around a third of the people they studied. When they analyzed sequences from the Cancer Genome Atlas, they discovered 691,000 more instances of LGT; at least 99 percent of these came from tumor samples rather than normal tissues.

Acute myeloid leukaemia cells were particularly rife with bacterial sequences. A third of the microbial genes came from a genus called Acinetobacter, and had been inserted into the mitochondrial genome.

Stomach cancer cells also contained a lot of bacterial DNA, especially from Pseudomonas. Most of this DNA had been inserted into five genes, four of which were already known to be proto-oncogenes that can give rise to cancer, emphasizing a possible link between LGT and cancerous growth. “Finding these integrations in multiple individuals, as well as in the proto-oncogenes, really spoke to how important this could be,” said Dunning Hotopp.

“We know already that a significant proportion of cancers are caused by insertion of genetic material from viruses,” said Etienne Danchin from the French National Institute for Agricultural Research, who reviewed the paper. “But this is the first time, as far as I know, that HGT from bacteria can be suspected as a cause of cancer.”

However, Dunning Hotopp is very clear that her results tell us nothing about whether the inserted bacterial DNA contributed to causing the cancers, or was just along for the ride. To get at the question of causation, researchers could deliberately add bacterial DNA into the same sites within human cell lines to see if they turn cancerous, she said. But even if bacterial LGT can initiate over-proliferation, it would be hard to prevent such transfers with antibiotics. “You don’t know when these transfers occur, and you can’t give people antibiotics their whole life,” said Dunning Hotopp. “A vaccine would be good, but that’s assuming these are causative.”

“LGT is incredibly important in evolution but many claims of specific cases of LGT have been seriously flawed,” said Eisen. “I came into this as a serious skeptic. It just seemed so unbelievable.”

But the team won him over. They ran a detailed set of tests to make sure that these bacterial sequences were not laboratory artifacts and had not come from contaminating microbes.

For example, they showed that LGT was more common in cancer cells than in healthy tissue, and two out of ten cancer types were particularly hard hit. If the bacterial integrations were artifacts of the methodology, they should be equally common in any tissue sample. The team also focused on sequences with high coverage, that is, those that had been read many times over. When the team found evidence of LGT, it was consistent across all of these reads. “In the end, the authors addressed every single question that I and the reviewers raised,” said Eisen.

Hank Seifert from Northwestern University, who was not involved in the study, remains cautious. “This paper is very interesting and potentially important,” he said. “However, until the direct analysis of specific tumor cells can be done to validate that these are real events, this work [is] still speculative.”

But Dunning Hotopp’s team cannot do these validation studies itself. For privacy reasons, they cannot access the original tumor samples that their data came from. “People with access to the samples need to validate that the integrations are correct,” she said.

Danchin agrees that the results need to be validated but said, “I am personally convinced what they have found by screening the different databases is true. I think LGT happens much more frequently than we imagine but, most of the time, is just not detectable.”

May 27th, 2014


DNA is a polymer. The monomer units of DNA are nucleotides, and the polymer is known as a “polynucleotide.” Each DNA nucleotide consists of three components: (1) a 5-carbon sugar (deoxyribose), (2) a nitrogen-containing base attached to the sugar, and (3) a phosphate group.

There are four different types of nucleotides found in DNA, differing only in the nitrogenous base (nucleobase). The nucleobases are adenine (A), guanine (G), cytosine (C), and thymine (T). The sugars and phosphates of the nucleotides bond strongly together to form the “backbone” of the double helix, to which the four bases attach to form the “rungs”.

The skeleton of adenine and guanine is purine, hence the name purine-bases. A purine has 9 atoms that make up the fused rings (5 carbon, 4 nitrogen). The skeleton of cytosine and thymine is pyrimidine, hence pyrimidine-bases. A pyrimidine has 6 atoms (4 carbon, 2 nitrogen). All ring atoms of both purines and pyrimidines lie in the same plane.


Within the DNA double helix, A forms 2 hydrogen bonds with T on the opposite strand, and G forms 3 hydrogen bonds with C on the opposite strand. dA-dT and dG-dC base pairs are the same length, and occupy the same space within a DNA double helix with uniform diameter. dA-dT and dG-dC base pairs can occur in any order within DNA molecules.
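The pairing rules above can be sketched in a few lines of Python. This is only an illustration, not part of the original post: the function names `reverse_complement` and `hydrogen_bonds` are hypothetical, and the sketch assumes a strand written with the four letters A, C, G and T.

```python
# Watson-Crick partners for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Hydrogen bonds contributed by each base pair: 2 for A-T, 3 for G-C.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def reverse_complement(strand: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def hydrogen_bonds(strand: str) -> int:
    """Count the hydrogen bonds holding this strand to its complement."""
    return sum(H_BONDS[base] for base in strand.upper())


print(reverse_complement("ATGC"))  # GCAT
print(hydrogen_bonds("ATGC"))      # 10 (2 + 2 + 3 + 3)
```

Because every base has exactly one partner, applying `reverse_complement` twice returns the original strand.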

This simplicity is useful when DNA replicates. The enzyme helicase unwinds and opens up the double helix. Another enzyme, DNA polymerase, matches each newly unpaired base with its complementary base. When replication is complete, there are two identical copies of the original DNA molecule. Because hydrogen bonds are not covalent, they can be broken and rejoined relatively easily, so the two strands of a double helix can be pulled apart like a zipper. This process, called melting, forms two single-stranded DNA (ssDNA) molecules. Melting occurs at high temperature, low salt and high pH. (Low pH also melts DNA, but DNA becomes unstable due to acid depurination, so low pH is rarely used.)

The stability of double-stranded DNA depends not only on the GC content (the percentage of G-C base pairs) but also on sequence and length (longer molecules are more stable). Long DNA helices with a high GC content have strongly interacting strands, while short helices with a high AT content have weakly interacting strands. In the laboratory, the strength of this interaction can be measured by finding the temperature needed to separate the strands, known as the melting temperature (Tm).
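As a rough illustration of how GC content relates to melting temperature, the sketch below computes the GC percentage and applies the Wallace rule, Tm ≈ 2 °C × (A+T) + 4 °C × (G+C). The Wallace rule is an assumption brought in for this example; it is a rule of thumb for short oligonucleotides only and ignores the sequence-order and salt effects mentioned above. The function names are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Percentage of bases in the sequence that are G or C."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees C (Wallace rule, short oligos only)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    # Each G-C pair (3 hydrogen bonds) is weighted twice as much as an A-T pair.
    return 2 * at + 4 * gc


for oligo in ("ATATATATAT", "ATGCATGCAT", "GCGCGCGCGC"):
    print(oligo, gc_content(oligo), wallace_tm(oligo))
# ATATATATAT 0.0 20
# ATGCATGCAT 40.0 28
# GCGCGCGCGC 100.0 40
```

For oligos of equal length, the estimate rises with GC content, which matches the trend described in the text.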

ANALOGUES: Nucleobase analogues are most commonly used as fluorescent probes. In medicine, they are used as anticancer and antiviral agents.

The information in DNA is stored as a genetic code made up of these four nucleobases. Human DNA consists of about 3 billion bases, and more than 99 percent of those bases are the same in all people. The order, or sequence, of these bases carries the information needed to build and maintain an organism.

January 17th, 2014


Watson and Crick’s discovery of the DNA double helix structure in 1953 galvanized the scientific community. Within a few years, attempts were being made to read the DNA sequence. Knowing the sequence of nucleotides in DNA would reveal the location of genes and of mutations across populations and species. In the 1970s, two methods were independently developed to sequence DNA. The American team, led by Maxam and Gilbert, used a “chemical cleavage protocol”, while the English team, led by Sanger, developed the “chain termination method”. Gilbert and Sanger shared the 1980 Nobel Prize in Chemistry, but Sanger’s method proved the more practical of the two and became the gold standard.


The Sanger method uses 2′,3′-dideoxynucleotide triphosphates (ddNTPs). ddNTPs differ from deoxynucleotide triphosphates (dNTPs) in having a hydrogen atom on the 3′ carbon instead of a hydroxyl group. Once incorporated, a ddNTP terminates DNA chain elongation because it cannot form a phosphodiester bond with the next dNTP.

The double-stranded DNA is first converted to single-stranded DNA by denaturing it with NaOH. The Sanger method requires the single-stranded DNA template, DNA polymerase, a DNA primer, all four dNTPs (dATP, dCTP, dGTP, and dTTP) and, in each of four reaction tubes, one of the four ddNTPs. For example, the “G” tube contains all four dNTPs, ddGTP and DNA polymerase; the “A” tube contains all four dNTPs, ddATP and DNA polymerase; and so on. The concentration of the ddNTP is about 1% of the concentration of the dNTPs. When the DNA polymerase is added, polymerization takes place and terminates whenever a ddNTP is incorporated into the growing strand. Once the reactions in all four tubes are complete, the DNA is denatured for electrophoresis. The contents of the four tubes are run in separate lanes on a polyacrylamide gel, and the gel is then exposed to X-ray film for reading. Smaller fragments travel faster and farther than larger ones because of their lower molecular weight. When the bands from the four lanes are read together from the bottom of the gel up, they give the 5′ to 3′ sequence of the newly synthesized strand, which is complementary to the template strand.
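The logic of reading a four-lane Sanger gel can be sketched as a small simulation. This is an idealized illustration under stated assumptions: the primer is ignored, every possible termination product is assumed to be present, and the function names (`synthesized_strand`, `sanger_lanes`, `read_gel`) are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesized_strand(template_3to5: str) -> str:
    """Strand built by the polymerase (5'->3') while reading the template 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3to5.upper())


def sanger_lanes(template_3to5: str) -> dict:
    """For each ddNTP tube, list the lengths of fragments that end in that base."""
    new_strand = synthesized_strand(template_3to5)
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        # A fragment of this length exists only in the tube holding dd<base>TP.
        lanes[base].append(length)
    return lanes


def read_gel(lanes: dict) -> str:
    """Read bands from the shortest (bottom of gel) to the longest (top)."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


lanes = sanger_lanes("TACGGATC")
print(lanes)            # {'A': [1, 7], 'C': [4, 5], 'G': [3, 8], 'T': [2, 6]}
print(read_gel(lanes))  # ATGCCTAG
```

Reading the bands in order of increasing fragment length recovers the new strand 5′ to 3′, which is the complement of the template.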


Automated sequencing: The single radioactive label of Sanger’s method was replaced with four fluorescent dyes, one for each of the four ddNTPs, so the reactions can be performed in a single tube containing all four ddNTPs, each labeled with a different color. The gel electrophoresis is then done in a single lane instead of four, and the DNA sequence is read from a chromatogram.

Microfluidic Sanger sequencing: Here the sequential steps of the Sanger method are integrated on a wafer-scale chip using nanoliter-scale sample volumes. It generates long and accurate sequences, while overcoming many shortcomings of the conventional Sanger method.

Capillary Sanger sequencing: This approach eliminates the use of gels. Instead, a semi-liquid polymer is injected into the capillary before each run. Automated sample injection allows hands-off operation during runs, and the amount of DNA sample required is reduced. Each run takes less than 3 hours.

The Human Genome Project: The 13-year project to decode the human genome was completed in 2003. Sanger sequencing was one of the methods used to sequence the DNA.

January 17th, 2014


With Sanger-based sequencing techniques, scientists began decoding genetic sequences from a wide array of species. Even though this was an exemplary development, the limitations of the existing methods soon became a hindrance. Next Generation Sequencing (NGS) was developed to overcome these limitations and to speed up progress in genomic science. The idea behind NGS technology is similar to that of Sanger-based methods: the bases of a small fragment of DNA are sequentially identified as each fragment is re-synthesized from a DNA template strand. But NGS can process many reactions simultaneously instead of one or a few at a time. If NGS had been developed before the Human Genome Project, human DNA could have been decoded in about a week for a few thousand dollars instead of 13 years and roughly 3 billion dollars.

Applications of Next Generation Sequencing: It is used for whole genome sequencing, targeted sequencing, resequencing, target enrichment, and the study of gene regulation, transcription, epigenetic changes, metagenomics, paleogenomics, etc.

Whole genome sequencing: Until a few years ago, sequencing an entire genome was arduous and time consuming. With Next Generation Sequencing, large genomes can be sequenced in a few days. Sequencing genomes that have never been sequenced before (de novo sequencing) poses an interesting challenge, because the reads have to be assembled without aligning them to a reference sequence. A problem with de novo sequencing is that the short read lengths generated by NGS can lead to a higher number of gaps and regions where no reads align, resulting in greater fragmentation, shorter continuous sequences and poorer data quality. This is usually seen in regions of the genome containing repetitive sequence elements.
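To illustrate what assembling reads without a reference involves, here is a toy greedy overlap assembler. It is a teaching sketch under simplifying assumptions (error-free reads, a single strand, a hypothetical minimum overlap of 3 bases) and is not how production assemblers work; the function names are hypothetical.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of a that equals a prefix of b (0 if < min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(reads)
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            # No remaining overlaps: whatever is left stays as separate contigs.
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


print(greedy_assemble(["ATGGCGTA", "GCGTACGT", "ACGTTAGC"]))
# ['ATGGCGTACGTTAGC']
print(greedy_assemble(["ATGGCGTA", "TTTTCCCC"]))
# ['ATGGCGTA', 'TTTTCCCC']  (no overlap, so the assembly stays fragmented)
```

The second call shows the gap problem described above: when no read spans the junction between two regions, the assembler can only report separate, shorter contigs.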

Targeted Sequencing: In this technique, only the required genes or defined regions in a genome are sequenced. This approach is used to sequence large numbers of individuals in order to discover, screen and validate genetic variation within a population and to identify rare genetic variants. The two methods for making libraries for targeted sequencing projects are target enrichment and amplicon sequencing.

Next-Gen Methods:

  1. Single-molecule real-time sequencing (SMRT) is based on the sequencing by synthesis approach and allows detection of nucleotide modifications such as cytosine methylation.
  2. Ion semiconductor sequencing (Ion Torrent sequencing) uses a semiconductor-based system to detect the hydrogen ions that are released during the polymerization of DNA, instead of the optical detection used in other methods.
  3. Pyrosequencing (454) amplifies DNA inside water droplets in an oil solution (emulsion PCR), with each droplet containing a single DNA template attached to a single primer-coated bead that forms a clonal colony. The machine contains many picoliter-volume wells each with a single bead and sequencing enzymes. Pyrosequencing uses luciferase to generate light for detection of the nucleotides.
  4. Sequencing by synthesis (Illumina) is based on reversible dye-terminators technology and engineered polymerases. DNA molecules and primers are attached on a slide and amplified with polymerase to form “DNA clusters”. Then, four types of reversible terminator bases (RT-bases) are added and non-incorporated nucleotides are washed away. A camera takes images of the fluorescently labeled nucleotides.
  5. Sequencing by ligation (SOLiD sequencing) uses a pool of oligonucleotides of a fixed length, labeled according to the sequenced position, which are annealed and ligated by DNA ligase. Before sequencing, the DNA is amplified by emulsion PCR, and the resulting beads, each carrying copies of a single DNA molecule, are deposited on a glass slide.


January 17th, 2014


Illumina sequencing is one of the next-generation techniques of DNA sequencing. It was developed by researchers at Manteia Predictive Medicine, which was acquired by Solexa in 2004; Solexa was later acquired by Illumina, hence the name Illumina sequencing. This sequencing method is based on reversible dye-terminators that enable the identification of single bases as they are introduced into DNA strands. The use of clonal arrays and massively parallel sequencing of short reads using reversible terminators was subsequently referred to as sequencing by synthesis technology, or SBS.

This method is often used to sequence difficult regions, such as homopolymers and repetitive sequences. It can also be used for whole-genome sequencing, region-targeted sequencing, de novo sequencing, transcriptome analysis, small RNA analysis, methylation profiling, and genome-wide protein-nucleic acid interaction analysis.


DNA molecules are first attached to primers on a slide and then copied by bridge amplification so that local clonal colonies of DNA, or “DNA clusters”, are formed. This cluster generation technology was invented and developed by Dr Pascal Mayer and Dr Laurent Farinelli in 1996 at Glaxo Wellcome’s Geneva Biomedical Research Institute (GBRI).

The DNA templates are sequenced base by base, in parallel, using four types (adenine, cytosine, guanine, and thymine) of reversible terminator bases (RT bases). Each type of base is fluorescently labeled with a different color and carries a blocking group at its 3′ end. The four bases compete for binding sites on the template DNA to be sequenced, and unbound nucleotides are washed away. This natural competition is intended to reduce incorporation bias and so improve accuracy. After each incorporation step, a laser excites the clusters and the fluorescent color specific to one of the four bases becomes visible, allowing the base to be identified. The 3′ blocking group and the dye are then removed so that the next cycle can begin. The process is repeated, one base per cycle, until the desired read length has been sequenced. This technique allows very large arrays of DNA colonies to be captured by sequential images taken from a single camera.
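The cycle described above (incorporate one blocked, labeled base; image; unblock; repeat) can be sketched as a loop. This is a simplified model, not Illumina’s software: the dye colors are hypothetical, every cluster is assumed to give a clean signal, and the function names are invented for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Hypothetical dye colors; real instrument chemistries differ.
DYE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
COLOR_TO_BASE = {color: base for base, color in DYE.items()}


def sequence_cluster(template_3to5: str, cycles: int) -> str:
    """Call one base per cycle for a single cluster."""
    calls = []
    for cycle in range(min(cycles, len(template_3to5))):
        # Only the complementary terminator is incorporated, and its 3' block
        # limits extension to exactly one base in this cycle.
        incorporated = COMPLEMENT[template_3to5[cycle]]
        # Imaging step: the camera records the dye color of the cluster.
        color = DYE[incorporated]
        calls.append(COLOR_TO_BASE[color])
        # Removing the dye and 3' block is implicit: the next loop iteration
        # extends the strand by one more base.
    return "".join(calls)


def sequence_flow_cell(templates, cycles):
    """All clusters are read in parallel; each image yields one base per cluster."""
    return [sequence_cluster(t, cycles) for t in templates]


print(sequence_flow_cell(["TACG", "GGCA"], cycles=3))  # ['ATG', 'CCG']
```

The number of cycles sets the read length, which is why every cluster in a run yields reads of the same length.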

This technique offers a number of advantages over traditional sequencing methods such as Sanger sequencing. Because Illumina dye sequencing is automated, it is possible to sequence many strands at once and obtain sequencing data quickly. Additionally, this method uses only DNA polymerase, as opposed to the multiple, expensive enzymes required by other sequencing techniques such as pyrosequencing.

Illumina sequencing has been used to study the transcriptomes of the sweet potato and the gymnosperm genus Taxus, and in oncology research. Illumina unveiled its HiSeq X (pronounced “High Seek 10”) on January 14th, 2014. The company presented it as the first sequencing system designed to process about 20,000 genomes per year at a cost of about $1,000 each. Previously it cost about $10,000 to sequence a human genome.

January 17th, 2014


The first major step the scientific community hopes to achieve in genomic research is the invention of reliable, advanced technology that will enable completion of the Human Genome Project, so that the complete genetic composition and function of the human being can finally be understood. The next step would be to establish reliable public data repositories and to address the problems the scientific community faces as a result of the current overwhelming increase in genetic data output. This should include allowing DNA sequences to be submitted for archiving with differentiated treatment. If correct sequence information is stored reliably and efficiently online, and scientists all over the world can access it, this will contribute greatly to genomic research. Work could then proceed sooner than anticipated in areas such as genomic sequences of key mammals; comprehensive collections of knockouts and knockdowns of all genes in selected animals, to accelerate the development of disease models; cohort populations for studies designed to identify genetic contributors to health and to assess the effect of individual gene variants on disease risk; and reference sets of proteins and of coding sequences from key species in various formats.

Improvements in genomic research have become more important than ever, considering the increase in the number of infectious and non-infectious diseases. All societies, including those in less developed countries, should therefore be encouraged to invest in large-scale studies of human genome variation in order to better understand genetic variants in all human populations, with due regard to environmental factors. As science continues to advance, scientists must also be encouraged to develop clinically applicable tools from genetic research, for example for risk prediction, diagnosis or therapeutic intervention. Growth in this area is critical for understanding and eradicating non-communicable diseases, because it will lead to the identification of rare and novel genetic variants associated with these diseases and their risk factors. It will also lead to the development of genome-based strategies for the early detection, diagnosis and treatment of various diseases. Identifying single nucleotide polymorphisms will also reveal variations that influence a person’s response to certain drugs. This will contribute to the development of pharmacogenetics (the field of science that deals with how genes affect a person’s response to drugs).

Lastly, exploring the ethical, legal and social issues that affect genomic research is also crucial. Steps should be taken to ensure that these issues do not become obstacles to significant developments in genomics. Deriving meaningful knowledge of the human genome is the ultimate aim of genomic research. Therefore, as we acknowledge remarkable achievements in the field such as the International HapMap Project, ENCODE and the Human Genome Project, it is crucial to emphasize that much about genetic composition, gene expression and gene function is still unknown, and that steps to advance this area of research are an absolute necessity.

January 16th, 2014
