This book presents numerical methods for bioinformatics, including the following: several numerical substitutions represent DNA and protein sequences as numerical or graphical sequences; Fourier and wavelet transforms are used in gene identification and protein comparison; for microarray or lipidomics data sets containing large amounts of experimental data, suitable statistical and algebraic methods are used to study the biological features of two sets; using feature-vector representations, two kinds of biological molecules are classified by cluster analysis; models based on differential and difference equations are constructed to represent biological dynamic processes; and missing-data imputation techniques help estimate entries missing from biological observations and experiments. In addition, some of the biological concepts these methods involve are also introduced.
Contents
PREFACE

CHAPTER 1 SOME BIOLOGICAL CONCEPTS
  1.1 Cell
  1.2 Genetic Material: DNA, Gene and RNA
    1.2.1 DNA
    1.2.2 Gene
    1.2.3 RNA
  1.3 Protein and Amino Acids
  1.4 Chromosome
  1.5 Omics
    1.5.1 Genomics
    1.5.2 Microarray
    1.5.3 Proteomics
    1.5.4 Lipidomics
  REFERENCES

CHAPTER 2 GRAPHICAL REPRESENTATIONS OF DNA SEQUENCE
  2.1 Three-Dimension (3-D) Graphical Representation
  2.2 2-D Graphical Representation
  2.3 2-D Graphical Representations Without Degeneracy
  2.4 Used a 1-D Numerical Representation of Four Nucleotides to Construct a 2-D Graphical Representation of the DNA Sequence
  REFERENCES

CHAPTER 3 NUMERICAL REPRESENTATIONS OF DNA SEQUENCE
  3.1 4-D and 3-D Numerical Representations of a DNA Sequence
  3.2 2-D Numerical Representations of a DNA Sequence
  3.3 The Complex Numerical Representation
  3.4 1-D Numerical Representations of Four Nucleotides and 2-D Graphical Representation of a DNA Sequence
  3.5 The Representations of Feature Vector, Genome Space and Matrix Representation of DNA Sequence
  3.6 The Numerical Representation Based on Physical, Chemical and Structural Properties of DNA Sequence
    3.6.1 The numerical representations based on some attribute equivalences of nucleotides
    3.6.2 The representation of DNA by the inspiration from codon and the idea of three attribute equivalences
    3.6.3 EIIP numerical representation for nucleotides
  REFERENCES

CHAPTER 4 NUMERICAL REPRESENTATIONS OF PROTEIN
  4.1 1-D Numerical and Graphical Representations of the Amino Acid Sequence
  4.2 2-D Numerical and Graphical Representations of the Amino Acid Sequence
  4.3 A 2-D Graphical Representation and Moment Vector Representation of Protein
  4.4 3-D Numerical Representation of Protein
  4.5 The 10-D Representation of an Amino Acid
  4.6 The Vector and Matrix Representations of Protein Sequence and Protein Space
  4.7 Other Schemes of the Representation for Protein
  REFERENCES

CHAPTER 5 PRACTICAL ORTHOGONAL TRANSFORM
  5.1 Some Features and Algorithms for the Discrete Fourier Transform
    5.1.1 Fourier transforms of the original sequence and its subsequence
    5.1.2 The independency of the Fourier transforms at several frequencies
    5.1.3 The Fourier transform of symbolic sequence
    5.1.4 Fourier transform of binary sequence
    5.1.5 Several algorithms of Fourier transform
    5.1.6 The properties of Fourier transform of real sequence
  5.2 Wavelet Analysis
    5.2.1 Introduction
    5.2.2 Multiresolution analysis of a function by Haar scaling and wavelet function
    5.2.3 Construction of wavelet systems
    5.2.4 Mallat transform
  REFERENCES

CHAPTER 6 IDENTIFYING PROTEIN-CODING REGIONS (EXONS) BY NUCLEOTIDE DISTRIBUTIONS
  6.1 Protein Coding Regions Finding in DNA Sequence
    6.1.1 Introduction
    6.1.2 The stochastic simulation and several computing formulae
    6.1.3 FEND algorithm, predicting protein coding regions from nucleotide distributions on the three positions of a DNA sequence
    6.1.4 Performance evaluation of FEND algorithm
  6.2 The Experiment for Distinguishing Exon and Intron Sequences by a Threshold
    6.2.1 Motivation
    6.2.2 Idea of distinguishing exon and intron sequences
    6.2.3 Results and discussion
  REFERENCES

CHAPTER 7 PROTEIN COMPARISON BY ORTHOGONAL TRANSFORMS
  7.1 Protein Comparison by Discrete Fourier Transformation (DFT)
    7.1.1 EIIP representation of protein sequence
    7.1.2 Symmetry of discrete Fourier transform of real sequence
    7.1.3 Cross-spectral function
  7.2 Protein Comparison by Discrete Wavelet Transformation
    7.2.1 Several techniques needed for DWT method
    7.2.2 The performance of the DWT method
  REFERENCES

CHAPTER 8 THE APPLICATION OF VECTOR REPRESENTATIONS TO BIOLOGICAL MOLECULE ANALYSIS
  8.1 Use Feature Vector to Analyze DNA Sequences
    8.1.1 Feature vector representation of DNA sequence
    8.1.2 Comparing DNA sequences
  8.2 A Protein Map and its Applications
    8.2.1 Recalling a 2-D graphical representation and moment vector representation of protein
    8.2.2 Protein map and cluster analysis
  8.3 An Appendix: Introduction to Cluster Analysis
  REFERENCES

CHAPTER 9 THE STATISTICS ANALYSIS OF LARGE AMOUNT OF EXPERIMENTAL DATA
  9.1 A Way to Process Microarray Data
    9.1.1 Data form
    9.1.2 Microarray data set
    9.1.3 Preliminary filtering
    9.1.4 Assessing normalization
    9.1.5 Hypothesis test
    9.1.6 Conclusion
  9.2 The Statistical Analysis of a Set of Lipidomics Data
    9.2.1 Introduction
    9.2.2 Statistical techniques of initial data processing
    9.2.3 Initial data arrangement
    9.2.4 Hypothesis testing analysis
  REFERENCES

CHAPTER 10 APPLY SINGULAR VALUE DECOMPOSITION TO MICROARRAY ANALYSIS
  10.1 SVD, PCA and GSVD
    10.1.1 Singular value decomposition
    10.1.2 Principal component analysis
    10.1.3 Generalized singular value decomposition
  10.2 Apply SVD/PCA to Microarray Analysis
  10.3 GSVD Analyzes the Microarray Data
  REFERENCES

CHAPTER 11 DYNAMICAL ANALYSIS MODELS OF GENE EXPRESSION
  11.1 Differential Equations Model of Gene Expression
    11.1.1 Transcription model
    11.1.2 Nonlinear dynamic equations
    11.1.3 Linearization of the nonlinear transcription model
    11.1.4 Approximating coefficient matrix M by Fourier series
    11.1.5 Solution to transcription matrix C and V
  11.2 Modified Linear Differential Equations Model
  11.3 Dynamical Model Based on Singular Value Decomposition
    11.3.1 Introduction
    11.3.2 Reducing gene's number
    11.3.3 The approach based on singular value decomposition (SVD)
    11.3.4 The methods of solving dynamical models
  REFERENCES

CHAPTER 12 MISSING MICROARRAY DATA INPUTTING
  12.1 The Ad Hoc Methods
  12.2 Missing Data Inputting Based on SVD
    12.2.1 A new way for missing data inputting
    12.2.2 Other method based on SVD
  12.3 Weighted K-Nearest Neighbors (KNN) Impute Algorithm
  12.4 Estimation of Missing Values in Microarray Data Based on the Least Square Principle
    12.4.1 Least squares estimate of the unknown variable
    12.4.2 The least square estimation of missing data based on genes
    12.4.3 The least square estimation of missing data based on arrays
    12.4.4 Combining the gene and array based estimates
  12.5 Local Least Square Inputting (LLSinpute)
    12.5.1 Selecting genes
    12.5.2 Gene-wise formulation of local least squares imputation
  12.6 The Comparison of the Methods of Missing Data Inputting
  REFERENCES

PLATE