Tutorials
- Biopython
- Introduction
- Quick Start
- Sequence Objects
- Sequences and Alphabets
- Sequences act like strings
- Slicing a sequence
- Turning Seq objects into strings
- Concatenating or adding sequences
- Changing case
- Nucleotide sequences and (reverse) complements
- Transcription
- Translation
- Translation Tables
- Comparing Seq objects
- MutableSeq Objects
- UnknownSeq Objects
- Working with strings directly
- Sequence annotation objects
- Sequence Input/Output
- Parsing or Reading Sequences
- Reading Sequence Files
- Parsing sequences from compressed files
- Parsing sequences from the net
- Sequence files as dictionaries
- Writing sequence files
- Multiple Sequence Alignment objects
- BLAST
- BLAST and other sequence search tools (experimental code)
- Accessing NCBI’s Entrez databases
- Entrez Guidelines
- EInfo: Obtaining information about the Entrez databases
- ESearch: Searching the Entrez databases
- EPost: Uploading a list of identifiers
- ESummary: Retrieving summaries from primary IDs
- EFetch: Downloading full records from Entrez
- ELink: Searching for related items in NCBI Entrez
- EGQuery: Global Query - counts for search terms
- ESpell: Obtaining spelling suggestions
- Parsing huge Entrez XML files
- Handling errors
- Specialized parsers
- Using a proxy
- Examples
- Using the history and WebEnv
- Swiss-Prot and ExPASy
- Going 3D: The PDB module
- Reading and writing crystal structure files
- Structure representation
- Disorder
- Hetero residues
- Navigating through a Structure object
- Analyzing structures
- Measuring distances
- Measuring angles
- Measuring torsion angles
- Determining atom-atom contacts
- Superimposing two structures
- Mapping the residues of two related structures onto each other
- Calculating the Half Sphere Exposure
- Determining the secondary structure
- Calculating the residue depth
- Common problems in PDB files
- Accessing the Protein Data Bank
- Bio.PopGen: Population genetics
- Phylogenetics with Bio.Phylo
- Sequence motif analysis using Bio.motifs
- Cluster analysis
- Data representation
- Missing values
- Random number generator
- Euclidean distance
- City-block distance
- The Pearson correlation coefficient
- Absolute Pearson correlation
- Uncentered correlation (cosine of the angle)
- Absolute uncentered correlation
- Spearman rank correlation
- Kendall’s \(\tau\)
- Weighting
- Calculating the distance matrix
- Calculating the cluster centroids
- Calculating the distance between clusters
- \(k\)-means and \(k\)-medians
- \(k\)-medoids clustering
- Representing a hierarchical clustering solution
- Performing hierarchical clustering
- Calculating the distance matrix
- Calculating the cluster centroids
- Calculating the distance between clusters
- Performing hierarchical clustering
- Performing \(k\)-means or \(k\)-medians clustering
- Calculating a Self-Organizing Map
- Saving the clustering result
- Supervised learning methods
- Graphics including GenomeDiagram
- KEGG
- Cookbook – Cool things to do with it
- Working with sequence files
- Filtering a sequence file
- Producing randomised genomes
- Translating a FASTA file of CDS entries
- Making the sequences in a FASTA file upper case
- Sorting a sequence file
- Simple quality filtering for FASTQ files
- Trimming off primer sequences
- Trimming off adaptor sequences
- Converting FASTQ files
- Converting FASTA and QUAL files into FASTQ files
- Indexing a FASTQ file
- Converting SFF files
- Identifying open reading frames
- Sequence parsing plus simple plots
- Dealing with alignments
- Working with sequence files
- The Biopython testing framework
- Advanced
- Where to go from here – contributing to Biopython
- Appendix: Useful stuff about Python
- About the contents
- References