Brief abstracts of talks
1994-96 |
|
|
|
21.
16.08.1994 |
|
I.B.Rogozin, Luciano Milanesi*, Nicolay A. Kolchanov
Institute of Cytology and Genetics SD RAS, Russia
and
* Istituto di Tecnologie Biomediche Avanzate, Consiglio
Nazionale delle Ricerche, Milano, Italy
Gene structure prediction using information on homologous
protein sequence
A new approach to predicting the structure of protein-coding
genes is suggested. The principal scheme of prediction is as
follows. First, the best potential exons are predicted in a sequence
of unknown function by detecting potential splice sites
and regions with high coding potential, and a list of the potential
amino acid fragments encoded by these exons is formed. The next step is
to test for homology between each amino acid fragment from the list
and proteins from the SWISS-PROT database of amino acid sequences.
Of all the homologous proteins, only the sequence with the best
identity score is chosen. The third step is reconstruction of
the exon-intron structure based on the homology
of protein sequences. Testing this method on an independent
control set (20 genes) has shown that the accuracy of exon/intron structure
prediction is comparable with that of Grail: 21% of real exons were lost
and 3% of non-real exons were found.
|
22.
05.09.1994 |
|
Eugene Kolker, Edward N. Trifonov
Department of Structural Biology, The Weizmann Institute
of Science
Modular structure of protein sequences and its
possible origins
Analysis of protein sequence length distributions
showed that ~20% of proteins are made of standard-size units of
~123 amino acids for eukaryotes and ~152 amino acids for prokaryotes.
This underlying regularity is approximately twice as strong in more
conserved proteins such as enzymes and proteins with a subunit
structure.
Among other possible reasons for such protein sequence
organization, a recombinational origin was proposed. One could
think that in early evolution DNA segments of the standard size
were shuffled among themselves. If that was the case, the initiation
triplets (methionine residues) should prefer positions
corresponding to multiples of the unit size. Our analysis of
eukaryotic sequences confirms this hypothesis.
|
23.
22.09.1994 |
|
V.Makeev
Institute of Molecular Biology
Use of different amino acid similarity scores
in Fourier analysis of protein sequences. Comparison of periodic
patterns in the gene and protein sequences of collagen
The Fourier transform of the autocorrelation function
of a sequence makes it possible to take amino acid similarity
into account accurately. Moreover, it is feasible to study periodic patterns
composed only of amino acids of some particular type, e.g. charged
or hydrophobic. Calculations of periodic patterns in the primary
structure of collagen using different similarity matrices show that
the periodic structures found in collagen originate from the distribution
of amino acids of different types. Nevertheless, a comparative
analysis of periodic patterns in the protein sequence and in the
sequence of the gene coding for this protein shows that at least
some patterns originate from the gene sequence and seem to arise
via gene duplications. [Makeev et al., 1995].
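The idea of scoring residue pairs with a similarity function before taking a Fourier transform can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the two toy scoring functions (identity and "both charged"), and the collagen-like toy sequence are hypothetical and are not the matrices or data used in the talk.

```python
import cmath

def autocorrelation(seq, score, max_lag):
    """Mean similarity score between residues k positions apart, k = 1..max_lag."""
    n = len(seq)
    acf = []
    for k in range(1, max_lag + 1):
        total = sum(score(seq[i], seq[i + k]) for i in range(n - k))
        acf.append(total / (n - k))
    return acf

def power_spectrum(values):
    """(period, power) pairs from the DFT of a mean-centred series."""
    m = len(values)
    mean = sum(values) / m
    centred = [v - mean for v in values]
    spectrum = []
    for f in range(1, m // 2 + 1):
        coeff = sum(c * cmath.exp(-2j * cmath.pi * f * t / m)
                    for t, c in enumerate(centred))
        spectrum.append((m / f, abs(coeff) ** 2))
    return spectrum

def identity(a, b):
    """Toy similarity: 1 for identical residues, 0 otherwise."""
    return 1.0 if a == b else 0.0

CHARGED = set("DEKR")

def both_charged(a, b):
    """Toy similarity restricted to one residue type (charged residues)."""
    return 1.0 if a in CHARGED and b in CHARGED else 0.0

# Hypothetical collagen-like sequence with glycine at every third position.
toy = "GPKGEPGDRGPA" * 10

acf = autocorrelation(toy, identity, 36)
best_period, _ = max(power_spectrum(acf), key=lambda p: p[1])

# The same machinery applied to charged residues only.
charged_acf = autocorrelation(toy, both_charged, 36)
```

Swapping `identity` for a scoring function derived from a substitution matrix changes which periodicities stand out, which is the kind of comparison the abstract describes.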
|
24.
1.12.1994 |
|
M.Gelfand
Statistical aspects of forensic DNA analysis (an
overview)
I'll present a not-too-deep overview of various
forensic DNA techniques including, in particular, analysis of variable
number of tandem repeats polymorphisms (DNA fingerprinting), analysis
of hypervariable loci in mitochondrial DNA, and applications of
phylogenetic analysis. Some well-known cases will be considered,
namely, the “Florida dentist case” (a dentist who transmitted AIDS to a
number of his patients), identification of the remains of the Romanov
family, and, to some extent, the O.J.Simpson case (a former football
player who is on trial in the USA for murdering his wife).
|
25.
15.12.1994 |
|
Sh.R.Sunyaev, V.G.Tumanyan, E.N.Kuznetsov
Institute of Molecular Biology
Statistical approach to the inverse folding problem
· The inverse folding problem.
· Brief overview of the current situation
in the field.
· The problem of description of the 3D structure.
· Classification of the existing protein
data bank.
· Our approach: statistical criteria and
similarity functionals used.
· The alignment problem in the considered
case.
· The probabilistic model and the problem
of the decision rule.
· Estimation of influence of certain amino
acids.
[Sunyaev et al., 1994 (in Russian); Sunyaev et al., 1997; Sunyaev,
1997].
|
26.
29.12.1994 |
|
A.A.Mironov, I.V.Grigoriev
Contacts of alpha helices and the peptide architecture
The Protein Data Bank has been analysed. Rules
for the major contacts of alpha helices are formulated. It is shown
that the hydrophobicity of a contact is the wrong criterion for selecting
major contacts. [Grigoriev et al., 1997b (in Russian); Grigoriev et al., 1998].
|
27.
19.1.1995 |
|
A.M.Leontovich
What should a weight matrix for alignment be?
Various approaches to choosing the matrix
of weights for residue substitutions in biosequences are discussed.
Under one of them, some theorems are proved on the optimality of the
Dayhoff matrix. The other, more pragmatic approach provides the "normality"
of the weight matrix. The problem of gap penalties will also be
discussed.
|
30.
1.3.1995 |
|
V.V.Panjukov
Institute of Mathematical Problems of Biology
Finding steady alignments: Similarity and distance
Some alignments remain optimal when the weight
parameters vary over a range of values. Alignments of this kind are
called steady. A method for finding all steady optimal alignments
of two sequences will be presented. It assumes that the gap penalty
is directly proportional to the gap length.
Previously it has been shown that if the weight of
one insertion/deletion is <0.5, the similarity-based and distance-based
alignments are not equivalent. An explanation for this fact will
be given.
[Panjukov, 1993].
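To make the notion of a steady alignment concrete, here is a minimal sketch that simply rescans a few gap weights with a standard Needleman-Wunsch alignment (gap penalty proportional to gap length) and groups consecutive weights that yield the same alignment. This brute-force scan is an illustrative assumption, not the method of the talk, which finds all steady alignments; the function names, the match/mismatch scores, and the test sequences are hypothetical.

```python
def global_alignment(a, b, match=1.0, mismatch=-1.0, gap=1.0):
    """Needleman-Wunsch with a linear gap penalty. Returns (score, row_a, row_b)."""
    n, m = len(a), len(b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] - gap
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] - gap

    def sub(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] - gap,
                              score[i][j - 1] - gap)

    # Traceback of one optimal alignment, preferring substitutions over gaps.
    row_a, row_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            row_a.append(a[i - 1]); row_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] - gap:
            row_a.append(a[i - 1]); row_b.append("-"); i -= 1
        else:
            row_a.append("-"); row_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(row_a)), "".join(reversed(row_b))

def steady_ranges(a, b, gaps):
    """Group consecutive gap weights that produce the same traced-back alignment."""
    groups = []
    for g in gaps:
        _, x, y = global_alignment(a, b, gap=g)
        if groups and groups[-1][0] == (x, y):
            groups[-1][1].append(g)
        else:
            groups.append(((x, y), [g]))
    return groups

groups = steady_ranges("ACGT", "AGCT", [0.5, 1.0, 2.0, 3.0])
```

For these toy sequences the gapped alignment is optimal at small gap weights and the ungapped one at large weights, so the scan reports at least two ranges. Only one optimal alignment is traced back per weight, so ties are not enumerated.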
|
32.
17.5.1995 |
|
M.Gelfand, A.Mironov, P.Pevzner
Spliced alignment: A new approach to gene recognition
The standard way of utilizing the information about
homologous proteins in exon assembly, which is to predict several
candidate exon-intron structures and then to submit them to similarity
search, has several obvious drawbacks. An alternative is provided
by the spliced alignment approach. We consider a set of probable
splicing sites or a set of candidate exons and apply an effective
procedure that simultaneously aligns all structures generated by
this set with a target protein sequence.
The program implementing the spliced alignment algorithm
correctly predicts all human genes from the testing set if a mammalian
relative is known. More distant targets provide a less perfect,
but still very good, level of recognition. Several seeming errors
proved to be results of alternative splicing or errors in GenBank
feature tables. The results on simulated data demonstrate that the
quality of prediction with strongly mutated targets crucially depends
on the quality of filtering of candidate exons. [Gelfand et al.,
1996b; Gelfand et al., 1996c; Mironov et al., 1998; Mironov et al.,
1999b; Mironov et al., 2000].
|
33.
31.5.1995 |
|
I.Dedinsky
Institute of Biomedical Chemistry
Prediction of B-epitopes using grammar parsing
B-epitope (antigenic site) is modeled as a sequence
site having stabilized (rigid) conformation. Stabilization is provided
by interaction of amino acids within the epitope. We introduce the
notion of epitope structure and construct a set of empirical rules
of structure formation. Then these rules are transformed into a
context-sensitive grammar. Parsing by this grammar recognizes antigenic
and non-antigenic sites. As opposed to existing algorithms for epitope
prediction that construct profiles of amino acids along the sequence,
the developed method is more sensitive to the amino acid context,
provides less ambiguous results, and is in general more reliable.
This work led to the problem of formal construction
of a grammar system based on a sample of positive and negative examples
of sequences. In order to do that we introduce a distributive similarity
operation on sequences allowing us to form generalized images of
example sets.
|
34.
1.5.1995 |
|
I.Dedinsky
Institute of Biomedical Chemistry
Similarity between biological sequences from the
point of view of mathematics and biology
1. Pairwise alignment of biological sequences
taking into account structural information.
2. Distributive operation of consensus construction
without alignment for biological sequences.
|
36.
13.07.1995 |
|
Leonid Mirny
Harvard University, Dept. of Chemistry
How protein may fold...
We study the thermodynamic and kinetic behavior
of a simple model for protein folding. Different scenarios of folding
are observed for a chain on a cubic lattice. We simulate protein
folding experiments and compare observed kinetics with the data
obtained in recent experiments. Detailed study of protein kinetics
and thermodynamics reveals two physically different mechanisms providing
fast folding. The role of intermediates in protein folding is discussed.
|
37.
20.7.1995 |
|
Leonid Mirny
Harvard University, Dept. of Chemistry
Fold recognition and dynamics in the space of contact
maps
We introduce an energy function for contact maps
of proteins, that takes into account pairwise interactions between
amino acids as well as hydrophobic interactions of amino acids with
water. The hydrophobic energy term is of a form that prefers an
optimal number of inter-protein contacts, specific for each amino
acid. We derived parameters of the energy function from a statistical
analysis of the contact maps of known structures. The energy function
was tested in several ways. First, the sequences obtained by randomly
scrambling the amino acids of a protein were screened by calculating
for each the energy of the protein's known contact map. This test
demonstrated strong sequence specificity of the introduced energy
function. Next we simulated protein dynamics by performing Monte-Carlo
moves in the space of contact maps. Topological and polymeric constraints
were taken into account by dynamic rules that reduced the possible
allowed steps. In good agreement with expectations, the method identifies
a set of local minima in the vicinity of the native state. We simulated
melting of a protein by performing Monte-Carlo dynamics at a high
temperature. Slow cooling from a partially unfolded state refolds
a protein to conformations very similar to the native one. We also
performed fold recognition experiments, i.e. screening a set of
known structures against a given sequence. The results for the BPTI
and myoglobin sequences are presented. In both cases the energy
of the native structure lies significantly below the average value
for the set. Moreover, the myoglobin sequence is able to identify
the structures of other members of the globin family as having the
lowest energy values in the set. The method is also able to identify
incorrect folds of BPTI in cases where other currently used potentials
failed to do so. Prospects for applying the method
to structure checking and fold recognition are discussed.
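An energy function of this general shape, a pairwise contact term plus a term that favors a residue-specific optimal number of contacts, can be sketched as follows. The quadratic form of the second term, the two-letter H/P alphabet, and all parameter values are assumptions chosen for illustration; the abstract does not give the functional form or the fitted parameters.

```python
def contact_map_energy(seq, contacts, pair_energy, optimal_contacts, weight=1.0):
    """Energy of a sequence threaded onto a contact map.

    contacts is a list of residue index pairs (i, j) that are in contact.
    The second term penalizes deviation from a residue-specific optimal
    contact count; its quadratic form is an assumption of this sketch.
    """
    n_contacts = [0] * len(seq)
    energy = 0.0
    for i, j in contacts:
        energy += pair_energy(seq[i], seq[j])
        n_contacts[i] += 1
        n_contacts[j] += 1
    for residue, n in zip(seq, n_contacts):
        energy += weight * (n - optimal_contacts[residue]) ** 2
    return energy

def hp_pair_energy(a, b):
    """Hypothetical pair term: only hydrophobic-hydrophobic contacts are favorable."""
    return -1.0 if a == "H" and b == "H" else 0.0

# Hypothetical optimal contact counts for a two-letter (H/P) alphabet.
OPTIMAL = {"H": 1, "P": 0}

toy_map = [(0, 2)]  # a single contact between residues 0 and 2
native = contact_map_energy("HPHP", toy_map, hp_pair_energy, OPTIMAL)
scrambled = contact_map_energy("PHPH", toy_map, hp_pair_energy, OPTIMAL)
```

Comparing `native` with the energies of scrambled sequences on the same map mirrors, in miniature, the sequence-specificity test described in the abstract.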
|
38.
12.9.1995 |
|
Ross Overbeek
Argonne National Laboratory
Interpreting microbial genomes
Two microbial genomes have already been completely
sequenced. Two more will be completed during the next few months.
I believe that there will be 10-15 complete genomes available within
18 months. What can be learned from these genomes? I propose to
discuss the central issues of how to reduce the cost of determining
functions for genes, for determination of operons, for analysis
of regulatory mechanisms, and how to acquire and organize the data
required to support this research.
|
40.
6.10.1995 |
|
M.Gelfand and M.Roytberg
New developments in recognition of coding regions
Since none of the gene recognition algorithms is
perfect, the developers have to keep a balance between over- and
underprediction. Usually some parameter more or less symmetrically
dependent on both these values (e.g. the correlation coefficient)
is optimized. However, there exist situations where there is no
symmetry, and errors of one type are much more serious than those
of the other type. We will consider two such situations.
(1) The number of candidate exons in a sequence
fragment is typically very large. On the other hand, many approaches
use algorithms polynomial (or even exponential) in this number.
Thus there arises the problem of preliminary filtration of the exon
set. Such a procedure should have sensitivity close to 100% (lose
nothing), although the specificity can be rather low.
(2) On the other hand, sometimes it is important
to predict only fragments of a gene (not even complete exons), but
with a very high specificity. This problem arises, in particular,
in construction of oligonucleotide probes and PCR primers for screening
cDNA libraries given a genomic fragment.
We will present algorithms based on the vector dynamic
programming approach that address these problems.
[Roytberg et al., 1997 (in Russian); Roytberg et al., 1997; Mironov
et al., 1998; Sze et al., 1998]
|
41.
1.12.1995 |
|
N.N.Vtyurin
Institute of Molecular Genetics
Modeling of the spatial structure of protein molecules
· Types of protein architecture.
· Search for structural analogs of a protein
with known amino acid sequence.
· Technology of computer modeling.
· Mekler's constructions.
|
42-43.
15.12.1995, 22.12.1995 |
|
Sh.R.Sunyaev
Institute of Molecular Biology
Statistical approach to the inverse protein folding
problem. Criteria of 3D-1D compatibility.
1) Brief introduction. Reduced representations of the
protein tertiary structure.
2) Basic assumptions of our approach.
3) The problem of 3D-1D compatibility as
a problem of statistical hypothesis testing.
4) Some criteria of 3D-1D compatibility.
5) Requirements for environmental variables
used for reduced structure representations. Tests performed on a
representative set of proteins from PDB.
6) Can statistical approach help to invent
new environmental variables?
[Sunyaev et al., 1995 (in Russian); Sunyaev et al., 1996 (in Russian)].
|
44.
26.1.1996 |
|
V.A.Shepelev
Institute of Molecular Genetics
Multidimensional dot-matrices
Dot-matrices of similarity are widely used for visualization
of similarity regions in a pair of nucleotide or amino acid sequences.
A generalization of the dot-matrix of homology to n sequences is
suggested. To visualize the n-dimensional dot-matrix,
a special projection that conserves the distances along the sequence
is displayed. The common regions of similarity are revealed as segments
of straight lines parallel to the main diagonal. An efficient algorithm
for n-dimensional dot-matrix calculation is suggested. The method
is useful for visualization of similarity regions, e.g. protein-coding
regions, for a wide variety of sequence families, as illustrated
by a number of examples. Up to ten sequences 10 kb each can be analysed
with this program. Some further improvements of the program are
discussed. [Shepelev & Yanishevsky, 1994].
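One simple way to picture an n-dimensional dot-matrix is to mark a point (i1, ..., in) whenever the same word of length k starts at position i1 in the first sequence, i2 in the second, and so on. The sketch below does this with exact k-mer matches; the exact-match criterion, the function names, and the toy sequences are assumptions for illustration and are not taken from the program described in the talk.

```python
from collections import defaultdict
from itertools import product

def kmer_positions(seq, k):
    """Map each k-mer to the list of its start positions in seq."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index

def nd_dots(seqs, k):
    """Points (i1, ..., in) where the same k-mer starts in every sequence."""
    indexes = [kmer_positions(s, k) for s in seqs]
    shared = set(indexes[0]).intersection(*indexes[1:])
    dots = []
    for word in shared:
        dots.extend(product(*(idx[word] for idx in indexes)))
    return sorted(dots)

# Hypothetical toy sequences sharing the word ACGT.
dots = nd_dots(["AAACGTAA", "TTACGTTT", "GGGACGTG"], k=4)
```

A region of similarity longer than k produces a run of dots with constant offsets between coordinates, that is, a segment parallel to the main diagonal.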
|
45.
9.2.1996 |
|
O.D.Ermolaeva
Institute of Bioorganic Chemistry
Mathematical model of subtractive hybridization
and its practical application
The first theory of subtractive hybridization is
developed. A kinetic model of this process is proposed and implemented
in a computer program modeling the subtraction process. A new method
of subtractive hybridization based on the theory allows one to perform
routine comparison of genomes and products of genome expression.
It is used in studies of the genetic mechanisms of embryogenesis,
regeneration, cell differentiation and tumor transformation. [Ermolaeva
& Wagner, 1995; Ermolaeva & Sverdlov, 1996; Ermolaeva et
al., 1996].
|
46.
23.2.1996 |
|
A.V.Prokhorov
Department of Mathematics, Moscow State University
Mathematical analysis of verse
1. Metric organization of speech.
2. Probabilistic models of the speech rhythm.
3. Mathematical analysis of verse.
|
47.
15.3.1996 |
|
G. Kutuzova
Institute of Molecular Biology
Artificial Neural Networks: some neural network
models and their applications in computer analysis of DNA and protein
sequences
Hopfield model, Kohonen model, Back-Propagation
model: architecture and topology, rules of weights modification,
learning algorithms. Applications to E. coli promoter recognition,
prediction of protein secondary structure, search for unusual motifs
in DNA sequences. Comparison with traditional methods. [Kutuzova
and Polozov, 1995 (in Russian)].
|
48.
3.9.1996 |
|
M.Gelfand
Genetics of two-spotted ladybird Adalia bipunctata
(a review)
Although the genetics of the two-spotted ladybird is not
as widely studied as, say, the genetics of Drosophila, its tradition
goes back to Dobzhansky and Timofeev-Ressovsky. This beetle has
some peculiar and interesting features. Its genome carries a large
number of recessive lethal mutations. It can be infected by a microbe
that kills all male eggs, while the female offspring are infected in turn.
The most interesting and widely used phenomenon is the color polymorphism.
There exist at least 12 variants of the coloration controlled by
a single gene. In general, melanic (black with red spots) alleles
are dominant, and typical (red with black spots) alleles are recessive.
The percentage of melanics in various populations ranges from 0% to
80%. There exist various explanations for this phenomenon, ranging
from ecological and geographical (melanics seem to be favored in seashore
populations, in large industrial cities, and at the boundaries of the
ladybird's range) to purely genetic (it seems that there exists a gene
responsible for the preference of females for melanic males). The history of the
polymorphism studies is quite dramatic, with sharp disagreements
between different groups, retractions and re-retractions, loss of
pure lines, etc. Finally, the papers themselves, especially those
dedicated to the sexual preferences of the ladies, are rather amusing.
|
49.
19.9.1996 |
|
A.A.Belyaev
Vernadsky Institute of Geochemistry and Analytical
Chemistry
Geochemical earthquake precursors
Long-term observations of the ground water composition
in several seismically active regions allowed us to obtain a new geochemical
predictive indicator of earthquake preparation. The observed natural
phenomenon involves the appearance of a regular sequence of specific
geochemical anomalies. The duration of the observed preseismic period
may exceed two years.
The serial regularity of such anomalies indicates that
there exists an oscillating force of changing (increasing) frequency
acting on the observed chemical system during this period. The discovered
effect formed the basis for an earthquake prediction method that uses
the analysis of frequency modulated oscillations (FMO) in the geochemical
system.
|
50.
3.10.1996 |
|
T.A.Borovina
Institute of Mathematical Problems of Biology
On the resolution of methods for calculation of
DNA redundancy
Three approaches for estimation of the DNA redundancy
are compared: the Shannon entropy, the Lempel-Ziv complexity, and
a new method, computation of the low frequency component of the
l-gram graph. Although these methods are based on different ideas,
they satisfy some reasonable requirements. The ability of these
methods to find various kinds of repeats is compared. [Kislyuk et
al., 1995 (in Russian)].
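Two of the three measures named above are standard and can be sketched briefly: the Shannon entropy of l-gram frequencies and the Lempel-Ziv (1976) complexity, counted as the number of phrases in the parsing. The function names and test strings are illustrative assumptions, and the third method (the low-frequency component of the l-gram graph) is not sketched because the abstract does not define it.

```python
import math
from collections import Counter

def shannon_entropy(seq, l=1):
    """Shannon entropy (bits) of the l-gram frequency distribution of seq."""
    counts = Counter(seq[i:i + l] for i in range(len(seq) - l + 1))
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def lz76_complexity(seq):
    """Number of phrases in the Lempel-Ziv (1976) parsing of seq.

    Each phrase is the shortest substring starting at the current position
    that does not occur earlier (overlap with the phrase itself is allowed,
    except for its last character). This simple version runs in roughly
    quadratic time or worse and is meant only for short sequences.
    """
    i, phrases, n = 0, 0, len(seq)
    while i < n:
        length = 1
        while i + length <= n and seq[i:i + length] in seq[:i + length - 1]:
            length += 1
        phrases += 1
        i += length
    return phrases

uniform = shannon_entropy("ACGT" * 4)            # four equally frequent letters
classic = lz76_complexity("0001101001000101")    # parses as 0|001|10|100|1000|101
repeat = lz76_complexity("ACGT" * 8)
```

A tandem repeat such as `"ACGT" * 8` keeps the maximal single-letter entropy but has a low Lempel-Ziv complexity, which illustrates how the measures differ in the kinds of repeats they detect.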
|
51.
19.12.1996 |
|
Sh. Sunyaev
Institute of Molecular Biology
Statistical analysis of residue conformational
properties, or what makes knowledge-based protein fold prediction
so difficult?
The validity of the theoretical basis of currently used
knowledge-based techniques for protein fold recognition was investigated.
The following three points were considered:
i) Is it possible to introduce a probability
distribution for various conformational properties of amino acid
residues?
ii) How strong are the statistical preferences
of amino acids for a specific environment?
iii) How conserved are conformational
properties among proteins with the same folding type?
[Sunyaev et al., 1998].
|