Lab of Postgenomic Research in Biology

People

Head of the Laboratory
Kostryukova Elena Sergeevna, Ph.D.

Akopian TA, Senior Researcher, Ph.D.
Semashko TA, Researcher, Ph.D.
Larin AK, Junior Researcher
Ospanova EA, Junior Researcher
Karpova IY, Junior Researcher
Chukin MM, Chief Engineer
Selezneva OV, Assistant Researcher
Babenko VV, Ph.D. student
Gornostaeva AS, Ph.D. student

Topics of interest and lines of research

  • Acquisition and analysis of complete genome sequences.
  • Genome-wide transcriptional analysis of prokaryotic and eukaryotic organisms.
  • Metagenomic projects investigating the community structure and metabolic reconstruction of microbial communities (microbiocenoses), including the specific features of symbiotic and parasitic colonization of different parts of the body.

Equipment

Automated Sanger sequencing based on capillary electrophoresis

The ABI Prism 3100 Genetic Analyzer (Applied Biosystems, USA) is a fluorescence-based DNA sequence analysis system built on the proven technology of capillary electrophoresis. The device has 16 capillaries, each 50 cm long, operating in parallel, and uses the 96-well plate format. In a full run of 2.5 hours, 16 DNA samples are sequenced with an average read length of 650 bp. The total output of the system is 144 samples per day.

The ABI Prism 3730XL DNA Analyzer (Applied Biosystems, USA) is a fluorescence-based DNA sequence analysis system built on the proven technology of capillary electrophoresis. The device has 96 capillaries, each 50 cm long, operating in parallel, and uses both 96- and 384-well microtiter plate formats. The instrument does not require the constant presence of an operator. In a full run of 2.5 hours, 96 DNA samples are sequenced with an average read length of 850 bp. The total output of the system is 864 samples per day.
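The daily output figures quoted for the two capillary analyzers are consistent with nine complete 2.5-hour runs per day. The Python sketch below reproduces that arithmetic. It assumes, and this is our reading rather than a manufacturer's statement, that only whole runs are counted and that the instrument is available around the clock; the function name is illustrative.

```python
import math


def samples_per_day(capillaries: int, run_hours: float, hours_per_day: float = 24.0) -> int:
    """Samples sequenced per day, counting only runs that finish within the day."""
    # Assumption: a partial run at the end of the day does not count.
    full_runs = math.floor(hours_per_day / run_hours)
    return capillaries * full_runs


abi_3100 = samples_per_day(capillaries=16, run_hours=2.5)
abi_3730xl = samples_per_day(capillaries=96, run_hours=2.5)
print(abi_3100, abi_3730xl)  # 144 864
```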

Automated Sanger sequencing based on capillary electrophoresis: sample preparation

The QPix2 robotic colony picker (Genetix, Great Britain) is an automated system for picking cell colonies or viral plaques from solid nutrient medium into 96- or 384-well plates or other special devices. Its software supports targeted selection of colonies by parameters such as size, color and shape. Replaceable needle adapters for the 96- or 384-well formats are used for picking. The QPix2 is indispensable for preparing classical genomic libraries for subsequent Sanger sequencing based on capillary electrophoresis.

Semiconductor sequencing

The Ion Torrent PGM next-generation sequencer (Life Technologies, USA) is a platform based on semiconductor sequencing technology for high-throughput sequencing of clonally amplified DNA fragments. The total output of the PGM ranges from 40 million to 1.2 billion bp, depending on the chip type and the read length (200 or 400 bp). The manufacturer offers many sample preparation options, so the PGM can be used efficiently for bacterial genome resequencing and de novo sequencing, metagenomic sequencing, transcriptome analysis and epigenetic studies. The manufacturer's additional equipment set, comprising the Ion OneTouch emulsion amplifier and the Ion OneTouch ES device for emulsion breaking and enrichment, automates and considerably speeds up sample preparation.

Sequencing by ligation

SOLiD 4

The SOLiD 4 Genetic Analyzer (Applied Biosystems, USA) is a platform based on sequencing by ligation for high-throughput sequencing of clonally amplified DNA fragments immobilized on magnetic beads. The system is highly flexible owing to its two independent flow cells and its ability to multiplex from one to 96 samples per analysis. The total output of the SOLiD is more than 100 billion bp per run. The 2-base encoding that underlies the technology not only gives a system accuracy greater than 99.94% but also makes it possible to discriminate between sequencing errors and real SNPs in samples. The manufacturer offers many sample preparation options, so the SOLiD 4 can be used efficiently for prokaryotic and eukaryotic genome resequencing, de novo sequencing of bacterial genomes, metagenomic sequencing, transcriptome analysis and epigenetic studies.

Sequencing by ligation, sample preparation

SOLiD™ EZ Bead™ Emulsifier

The manufacturer's SOLiD™ EZ Bead™ System is a set of additional equipment consisting of the SOLiD™ EZ Bead™ Emulsifier (a device for emulsion formation), the SOLiD™ EZ Bead™ Amplifier (a dedicated thermocycler) and the SOLiD™ EZ Bead™ Enricher (a device for emulsion breaking and enrichment). It provides highly reproducible results and keeps the operating costs of the system relatively low. Using the SOLiD™ EZ Bead™ System considerably raises the output of the sequencing platform and reduces the cost of high-throughput parallel sequencing.
 

SOLiD™ EZ Bead™ Amplifier

SOLiD™ EZ Bead™ Enricher

Pyrosequencing

The GS FLX+ next-generation sequencer (Roche Diagnostics, USA) is a platform based on pyrosequencing for high-throughput sequencing of clonally amplified DNA fragments. The total output of the GS FLX+ is up to 1 billion bp, with read lengths of up to 1 kb. The sample preparation approach offered by the manufacturer makes the GS FLX+ most effective for resequencing and de novo sequencing of bacterial genomes and for amplicon sequencing, as well as for metagenomic analysis, transcriptome profiling and epigenetic research when only small amounts of DNA are available.
 

Equipment: chip hybridization technology

High-density chips based on BeadArray technology

The TECAN Freedom Evo robotic system (Illumina, USA) is used to work with Illumina high-density biochips. The device does not require the constant presence of an operator: it deposits samples onto the microarrays and carries out the subsequent procedures in standalone mode. A working cycle for processing 24 biochips takes 4 hours.

The iScan (Illumina, USA) is designed for scanning Illumina high-density biochips, enabling screening studies in transcriptomics, epigenetics, and whole-genome or locus-specific genotyping. The instrument has a dual-laser excitation system with wavelengths of 532 and 658 nm, and its scanning resolution is 540 nm. Automatic image analysis delivers data for statistical processing without delay.
The BeadArray technology is based on microscopic silica beads deposited on a prepared glass surface. The beads are 3 µm in size, and each carries specific oligonucleotide probes. The current Infinium HD biochip lines allow simultaneous analysis of up to 5 million SNPs in 12 samples on a single chip.

Scientific projects

Genome sequencing of Acholeplasma laidlawii

Acholeplasma laidlawii, a species belonging to the class Mollicutes, has specific features that considerably distinguish it from other mollicutes. We were the first to determine and annotate the complete genome sequence of Acholeplasma laidlawii PG8, which contains 1,496,992 bp. We also characterized the proteome, identifying 521 proteins (44% of all predicted). This first proteome of Acholeplasma laidlawii PG8 served as a basis for further proteomic and genomic profiling of the microorganism, allowing us to verify the genome annotation and to validate the activity of organism-specific metabolic pathways. Notably, the genome of Acholeplasma laidlawii PG8 is the first complete bacterial genome determined entirely by Russian scientists.

Complete genome and proteome of Acholeplasma laidlawii. Lazarev VN, Levitskii SA, Basovskii YI, Chukin MM, Akopian TA, Vereshchagin VV, Kostrjukova ES, Kovaleva GY, Kazanov MD, Malko DB, Vitreschak AG, Sernova NV, Gelfand MS, Demina IA, Serebryakova MV, Galyamina MA, Vtyurin NN, Rogov SI, Alexeev DG, Ladygina VG, Govorun VM.
J Bacteriol. 2011 Sep;193(18):4943-53.

Genome sequencing of Spiroplasma melliferum

The structural organization and function of spiroplasmas are of lasting interest because of the economic losses some spiroplasmas cause in agriculture. Three species, S. citri, S. kunkelii and S. phoeniceum, affect agricultural plants.

Spiroplasma melliferum is a bee pathogen that may be implicated in mass deaths of honey bees. Understanding the mechanisms of interaction between spiroplasmas and their host organisms and identifying bacterial virulence factors are important challenges. We determined and annotated a partial genome sequence of S. melliferum KC3. This portion (1,260,174 bp) comprises 88% of the whole genome size predicted from physical maps. We also characterized the proteome of S. melliferum KC3, identifying 521 proteins, or 44% of the predicted ORFs. Finally, we reported the metabolome of this bacterium for the first time.

The sequencing data were used to investigate mobile elements in the S. melliferum KC3 genome. The mobile elements were shown to contain genes for phage proteins, fragments or entire sequences of transposase genes, and genes for other proteins that are characteristic of the plectovirus family and act as virulence factors. Notably, such constructs have a high potential for horizontal gene transfer.

Application of Spiroplasma melliferum proteogenomic profiling for the discovery of virulence factors and pathogenicity mechanisms in host-associated spiroplasmas. Alexeev D, Kostrjukova E, Aliper A, Popenko A, Bazaleev N, Tyakht A, Selezneva O, Akopian T, Prichodko E, Kondratov I, Chukin M, Demina I, Galyamina M, Kamashev D, Vanyushkina A, Ladygina V, Levitskii S, Lazarev V, Govorun V.
J Proteome Res. 2012 Jan 1;11(1):224-36.

Metagenome of the natural impoundment Krotovaya Lyaga

In collaboration with ICG SB RAS.

Krotovaya Lyaga is a natural impoundment that attracts great interest from researchers because of its abnormally high rate of degradation of polluting biopolymers. We aimed to carry out a metagenomic analysis of the microbial population colonizing the bottom sediment. First, we developed a method for DNA extraction from sediment samples. From the DNA sample prepared in this way, we built a genomic library containing fragments of 2,000 bp. To date, 8,000 reads have been obtained by Sanger sequencing based on capillary electrophoresis.

In addition, a fragment library was prepared and sequenced on the Applied Biosystems SOLiD™ 4 System using 1.5 slides.

Our data were sent to ICG SB RAS for further analysis. The metagenome of the sediment sample exhibited a large number and great diversity of genes encoding enzymes of metabolic pathways for biopolymer degradation, cellulases among them.

Computer analysis of metagenomic data-prediction of quantitative value of specific activity of proteins. Ivanisenko VA, Demenkov PS, Pintus SS, Ivanisenko TV, Podkolodny NL, Ivanisenko LN, Rozanov AS, Bryanskaya AV, Kostrjukova ES, Levizkiy SA, Selezneva OV, Chukin MM, Larin AK, Kondratov IG, Lazarev VN, Peltek SE, Govorun VM, Kolchanov NA. Dokl Biochem Biophys. 2012; 443, No. 2, pp. 1-5.

«Russian Metagenome»

Within the Russian metagenomic consortium (www.metagenome.ru)

The normal flora, or microbiota, of humans is of general biological significance. The intestinal microbiocenosis is considered a highly organized system that undergoes qualitative and quantitative shifts in response to the dynamic state of the body under different conditions of activity, health and disease.
In the course of the project, deep sequencing was carried out on the Applied Biosystems SOLiD™ 4 System to explore more than 150 fecal samples collected from healthy individuals in different areas of the Russian Federation, including both large cities (Moscow, Saint Petersburg, Rostov-on-Don, Saratov, Novosibirsk) and rural areas (the Omsk, Tyva, Khakassia and Tatarstan regions). Nucleotide sequences from each sample were mapped onto the known genomes of intestinal bacteria, and the samples were then compared with each other by statistical methods in terms of both taxonomic units and functional gene groups. To check for distinctive features of the Russian metagenome composition in a global context, we performed a comparative analysis using existing data for the adult populations of Western Europe (Denmark, n = 85) and North America (USA, n = 137), and for the native populations of South America (Venezuela, n = 10) and Africa (Malawi, n = 5).

On the whole, the results of our research agree with those of analogous studies in America and Europe. Of the 86 microbial genera represented in the database, 61 were found in at least one of the studied samples.

Members of the genera Prevotella, Bacteroides, Faecalibacterium, Roseburia, Lachnospiraceae, Coprococcus, Blautia, and Ruminococcus were quantitatively dominant. Altogether they make up more than 80% of the total relative abundance in all the Russian samples.

At the same time, the Russian cohort is distinguished by unusual dominant microbial combinations that have not been reported before. These clusters include members of the genera Prevotella, Lachnospiraceae, Coprococcus, Faecalibacterium, Roseburia, Ruminococcus, and Bifidobacterium. Notably, these combinations are found in inhabitants of remote rural areas but are not common among urban inhabitants, whose microbiota is qualitatively and quantitatively similar to that of the European population.

The similarity between the samples in a cluster depended on the choice of distance metric: the bacterial proportions were similar according to Spearman's correlation (0.93±0.07, mean±SD), but the UniFrac distance between the samples was quite high (0.04±0.03). This dependence on the metric may arise because the samples are very similar in composition yet differ significantly at the quantitative level, a situation characteristic of family members as well as of closed populations.
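Spearman's correlation compares only the rank order of taxa, so two samples can score as nearly identical even when their actual proportions differ. The sketch below is a minimal pure-Python illustration of that property. The abundance vectors are invented for the example and are not data from the study, the function names are our own, and UniFrac is not implemented here because it additionally requires a phylogenetic tree.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical relative abundances of five genera in two samples.
sample_a = [0.40, 0.25, 0.15, 0.12, 0.08]
sample_b = [0.70, 0.12, 0.08, 0.06, 0.04]

# Same ordering of genera, very different proportions: rho is (numerically) 1.
print(round(spearman(sample_a, sample_b), 6))
```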

MALINA: a web service for visual analytics of human gut microbiota whole-genome metagenomic reads. Alexander V Tyakht, Anna S Popenko, Maxim S Belenikin, Ilya A Altukhov, Alexander V Pavlenko, Elena S Kostryukova, Oksana V Selezneva, Andrei K Larin, Irina Y Karpova, Dmitry G Alexeev. Source Code Biol Med. (2012) 7(1):13.

Transcriptome profiling of the moss Physcomitrella patens

in collaboration with IBC RAS

The research is devoted to systems analysis of peptide genesis in plant cells. An important step in this analysis is the study of gene transcription, including the transcription of genes that encode peptide precursor proteins. In the course of the project, transcriptome analysis was carried out for the gametophores, protonema and protoplasts of the moss Physcomitrella patens.

Sequencing was performed on the Applied Biosystems SOLiD™ 4 System. Samples were analyzed in several replicates. For gametophores, both biological and technical replicates were performed in duplicate. For protonema and protoplasts, three biological replicates and two technical replicates were done. As a result, we obtained 173, 197 and 204 million reads for the samples of gametophores, protonema and protoplasts, respectively. The Spearman correlation coefficient between the gene expression values detected by PCR and by RNA-seq was 0.7, 0.7 and 0.8 for gametophores, protonema and protoplasts, respectively.

The expression of genes encoding precursor proteins was compared with that of other genes. Gene expression levels in all three states were found to be lognormally distributed, with the precursor protein genes being highly expressed. This was confirmed by the Mann-Whitney-Wilcoxon test: W = 263423.5 with a p-value < 2.2e-16 for gametophores, W = 447475.5 with a p-value < 2.2e-16 for protonema, and W = 9291323 with a p-value < 2.2e-16 for protoplasts.
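For readers unfamiliar with the W statistic, the sketch below shows one common way it is defined (the convention used, for example, by R's wilcox.test): the number of pairs in which a value from the first group exceeds a value from the second, with ties counted as one half. The expression values are invented for illustration and are not from the study, and the p-value calculation is omitted. Under this definition, half-integer values of W such as those reported above are what one would expect when some expression values are tied.

```python
def mann_whitney_w(x, y):
    """W statistic for group x versus group y.

    Counts the (x_i, y_j) pairs with x_i > y_j; each tied pair contributes 0.5.
    This quadratic-time version is meant for clarity, not for large datasets.
    """
    w = 0.0
    for a in x:
        for b in y:
            if a > b:
                w += 1.0
            elif a == b:
                w += 0.5
    return w


# Hypothetical log-expression values (illustrative only).
precursor_genes = [8.1, 7.4, 9.0, 6.8]
other_genes = [3.2, 5.1, 7.0, 2.5, 4.4]

# W ranges from 0 to len(x) * len(y); here the maximum is 20.
print(mann_whitney_w(precursor_genes, other_genes))  # 19.0
```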

Also noteworthy is the decreased overall level of gene expression in protoplasts (W = 64461020, p-value < 2.2e-16). The results obtained for gene transcription in different tissues of the moss P. patens, together with the proteome and peptidome data, may be used to build systems models of the processes of peptide genesis.

New candidate markers for colorectal cancer, as determined by whole genome methylation

DNA methylation is an important mechanism for regulating the genetic apparatus of cells in normal and pathological states. Using the Human Methylation 450 biochip for the Illumina iScan Plus system, we examined the methylation status of 485,577 DNA sites in 22 samples of rectal adenocarcinoma and 22 biopsy samples of normal rectal epithelium from the same patients.

To generate the initial panel of candidate markers, we selected differentially methylated CpG sites that simultaneously satisfied the following three criteria: (1) the difference between the average methylation levels in the two groups is greater than 40% with FDR-adjusted p-values < 0.05; (2) the variance of the methylation levels in the normal group is less than 25%; (3) the extreme values of methylation in the compared groups do not overlap (Information Gain = 1).
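The three selection criteria can be sketched as a simple filter. This is only an illustration under stated assumptions, not the pipeline used in the study: methylation levels are taken as fractions between 0 and 1, the FDR-adjusted p-value is assumed to be precomputed and stored as `q_value`, and the "variance" criterion is interpreted here as the spread (maximum minus minimum) within the normal group, which is one possible reading of the text. Site names and values are invented.

```python
from statistics import mean


def select_candidate_sites(sites, min_delta=0.40, max_q=0.05, max_normal_spread=0.25):
    """Return IDs of CpG sites that pass all three criteria."""
    selected = []
    for site in sites:
        tumor, normal = site["tumor"], site["normal"]
        # (1) Difference between group means, plus the precomputed FDR q-value.
        delta = abs(mean(tumor) - mean(normal))
        # (2) Assumption: "variance" is read as the max-min spread in normal tissue.
        normal_spread = max(normal) - min(normal)
        # (3) The ranges of the two groups must not overlap at all.
        no_overlap = min(tumor) > max(normal) or max(tumor) < min(normal)
        if (delta > min_delta and site["q_value"] < max_q
                and normal_spread < max_normal_spread and no_overlap):
            selected.append(site["id"])
    return selected


# Hypothetical methylation levels for three made-up sites.
example_sites = [
    {"id": "site_A", "tumor": [0.80, 0.75, 0.90], "normal": [0.20, 0.25, 0.30], "q_value": 0.001},
    {"id": "site_B", "tumor": [0.60, 0.55, 0.65], "normal": [0.40, 0.45, 0.35], "q_value": 0.01},
    {"id": "site_C", "tumor": [0.90, 0.85, 0.35], "normal": [0.10, 0.15, 0.40], "q_value": 0.02},
]

print(select_candidate_sites(example_sites))  # ['site_A']
```

Here site_B fails the mean-difference threshold, and site_C fails both the spread and the no-overlap criteria.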

After this algorithm was applied, the preliminary panel included 14 CpG sites. A classification model based on linear regression was trained on the initial panel, and additional attributes were selected, with RapidMiner version 5.1. The three most informative methylation sites, in the genes ADHFE1 (cg01588438), COL4A1 (cg27546237) and C1orf70 (cg15487867), were included in the final panel. The model was then validated on methylation data from the Cancer Genome Atlas consortium, namely 247 samples comprising 205 samples of adenocarcinoma and 42 samples of normal tissue. The independent validation test showed a diagnostic sensitivity of 98.09% and a diagnostic specificity of 97.37%. The association between the gene C1orf70 (cg15487867) and colorectal cancer was established for the first time. Such an association had been reported earlier for the genes ADHFE1 (cg01588438) and COL4A1 (cg27546237), but these genes had not been analyzed as candidate diagnostic markers. The results of the classification model, verified on the independent panel, suggest that the selected markers have diagnostic potential.
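Diagnostic sensitivity and specificity are derived from the counts of correctly and incorrectly classified samples. The short sketch below shows the standard definitions; the counts are invented for illustration and are not the validation results of the study.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): share of tumor samples classified as tumor.
    specificity = TN / (TN + FP): share of normal samples classified as normal.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts: 100 tumor samples and 50 normal samples.
sens, spec = sensitivity_specificity(tp=95, fn=5, tn=45, fp=5)
print(f"sensitivity = {sens:.2%}, specificity = {spec:.2%}")
```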

Selected publications

Human gut microbiota compositions found across urban and rural populations of Russia. Tyakht AV, Kostryukova ES, Popenko AS, Belenikin MS, Pavlenko AS, Larin AK, Karpova IY, Selezneva OV, Semashko TA, Ospanova EA, Babenko VV, Maev IV, Cheremushkin SV, Kucheryavy YA, Shcherbakov PL, Grinevich VB, Efimov OI, Sas EI, Abdulkhakov RA, Abdulkhakov SA, Lyalyukova EA, Livzan EA, Vlassov VV, Sagdeev RZ, Tsukanov VV, Osipenko MF, Kozlova IV, Tkachev AV, Sergienko VI, Alexeev DG, Govorun VM. Nature Communications 2013, doi:10.1038/ncomms3469

Draft Genome of the Nitrogen-Fixing Bacterium Pseudomonas stutzeri Strain KOS6 Isolated from Industrial Hydrocarbon Sludge. Grigoryeva TV, Laikov AV, Naumova RP, Manolov AI, Larin AK, Karpova IY, Semashko TA, Alexeev DG, Kostryukova ES, Muller R, Govorun VM. Genome Announc. 2013 Jan;1(1):e00072-12. doi:10.1128/genomeA.00072-12. Epub 2013 Jan 31.

MALINA: a web service for visual analytics of human gut microbiota whole-genome metagenomic reads. Alexander V Tyakht, Anna S Popenko, Maxim S Belenikin, Ilya A Altukhov, Alexander V Pavlenko, Elena S Kostryukova, Oksana V Selezneva, Andrei K Larin, Irina Y Karpova and Dmitry G Alexeev, Source Code for Biology and Medicine 2012, 7:13 doi:10.1186/1751-0473-7-13

Chromosome 18 Transcriptome Profiling and Targeted Proteome Mapping in Depleted Plasma, Liver Tissue and HepG2 Cells. Zgoda, Victor G., Arthur T. Kopylov, Olga V. Tikhonova, Alexander A. Moisa, Nadezhda V. Pyndyk, Tatyana E. Farafonova, Svetlana E. Novikova, Andrey V. Lisitsa, Elena A. Ponomarenko, Ekaterina V. Poverennaya, Sergey P. Radko, Svetlana A. Khmeleva, Leonid K. Kurbatov, Aleksey D. Filimonov, Nadezhda A. Bogolyubova, Ekaterina V. Ilgisonis, Aleksey L. Chernobrovkin, Alexis S. Ivanov, Alexei E. Medvedev, Yury V. Mezentsev, Sergei A. Moshkovskii, Stanislav N. Naryzhny, Elena N. Ilina, Elena S. Kostryukova, Dmitry G. Alexeev, Alexander V. Tyakht, Vadim M. Govorun, and Alexander I. Archakov. Journal of proteome research 12, no. 1 (2012): 123-134.

The application of Spiroplasma melliferum proteogenomic profiling for the discovery of virulence factors and pathogenicity mechanisms in host-associated spiroplasmas. Dmitry Alexeev, Elena Kostrjukova, Alexander Aliper, Anna Popenko, Nikolay Bazaleev, Alexander Tyakht, Oksana Selezneva, Tatyana Akopian, Elena Prichodko, Ilya Kondratov, Mikhail Chukin, Irina Demina, Maria Galyamina, Dmitri Kamashev, Anna Vanyushkina, Valentina Ladygina, Sergei Levitskii, Vasily Lazarev and Vadim Govorun. J. Proteome Res., DOI: 10.1021/pr2008626. Publication Date (Web): November 30, 2011

Complete genome and proteome of Acholeplasma laidlawii. Lazarev VN, Levitskii SA, Basovskii YI, Chukin MM, Akopian TA, Vereshchagin VV, Kostrjukova ES, Kovaleva GY, Kazanov MD, Malko DB, Vitreschak AG, Sernova NV, Gelfand MS, Demina IA, Serebryakova MV, Galyamina MA, Vtyurin NN, Rogov SI, Alexeev DG, Ladygina VG, Govorun VM. J Bacteriol. 2011 Sep;193(18):4943-53. Epub 2011 Jul 22.

Functional divergence of Helicobacter pylori related to early gastric cancer. Momynaliev KT, Kashin SV, Chelysheva VV, Selezneva OV, Demina IA, Serebryakova MV, Alexeev D, Ivanisenko VA, Aman E, Govorun VM. J Proteome Res. 2010 Jan;9(1):254-67.