Tutorials

Biopython
- Tutorial and Cookbook
Introduction
- What is Biopython?
- What can I find in the Biopython package
- About these notebooks
Quick Start
- General overview of what Biopython provides
- Working with sequences
- A usage example
- Parsing sequence file formats
- Connecting with biological databases
- What to do next
Sequence Objects
- Sequences and Alphabets
- Sequences act like strings
- Slicing a sequence
- Turning Seq objects into strings
- Concatenating or adding sequences
- Changing case
- Nucleotide sequences and (reverse) complements
- Transcription
- Translation
- Translation Tables
- Comparing Seq objects
- MutableSeq Objects
- UnknownSeq Objects
- Working with strings directly
Sequence annotation objects
- The SeqRecord Object
- Creating a SeqRecord
- Feature, location and position objects
- References
- The format method
- Slicing a SeqRecord
- Adding SeqRecord objects
- Reverse-complementing SeqRecord objects
Sequence Input/Output
- Parsing or Reading Sequences
- Reading Sequence Files
- Parsing sequences from compressed files
- Parsing sequences from the net
  - Parsing GenBank records from the net
  - Parsing SwissProt sequences from the net
- Sequence files as dictionaries
- Writing sequence files
Multiple Sequence Alignment objects
- Parsing or Reading Sequence Alignments
- Writing Alignments
  - Converting between sequence alignment file formats
  - Getting your alignment objects as formatted strings
- Manipulating Alignments
  - Slicing alignments
  - Alignments as arrays
- Alignment Tools
BLAST
- Running BLAST over the Internet
- Saving blast output
- Running BLAST locally
- Parsing BLAST output
- The BLAST record class
- Deprecated BLAST parsers
- Bio.Blast.NCBIStandalone
BLAST and other sequence search tools (experimental code)
- The SearchIO object model
  - QueryResult
  - Hit
  - HSP
  - HSPFragment
- A note about standards and conventions
- Reading search output files
- Dealing with large search output files with indexing
- Writing and converting search output files
Accessing NCBI’s Entrez databases
- Entrez Guidelines
- EInfo: Obtaining information about the Entrez databases
- ESearch: Searching the Entrez databases
- EPost: Uploading a list of identifiers
- ESummary: Retrieving summaries from primary IDs
- EFetch: Downloading full records from Entrez
- ELink: Searching for related items in NCBI Entrez
- EGQuery: Global Query - counts for search terms
- ESpell: Obtaining spelling suggestions
- Parsing huge Entrez XML files
- Handling errors
- Specialized parsers
- Using a proxy
- Examples
- Using the history and WebEnv
Swiss-Prot and ExPASy
- Parsing Swiss-Prot files
  - Parsing Swiss-Prot records
  - Parsing the Swiss-Prot keyword and category list
- Parsing Prosite records
- Parsing Prosite documentation records
- Parsing Enzyme records
- Accessing the ExPASy server
- Scanning the Prosite database
Going 3D: The PDB module
- Reading and writing crystal structure files
- Structure representation
- Disorder
- Hetero residues
- Navigating through a Structure object
- Analyzing structures
- Common problems in PDB files
- Accessing the Protein Data Bank
Bio.PopGen: Population genetics
- GenePop
- Operations on GenePop records
- Coalescent simulation
  - Creating scenarios
    - Demography
    - Chromosome structure
  - Running SIMCOAL2
Phylogenetics with Bio.Phylo
- Demo: what is in a tree?
  - Coloring branches within a tree
- I/O functions
- View and export trees
- Using Tree and Clade objects
- Running external applications
- PAML integration
Sequence motif analysis using Bio.motifs
- Motif objects
  - Creating a motif from instances
  - Creating a sequence logo
- Reading motifs
- Writing motifs
- Position-Weight Matrices
- Position-Specific Scoring Matrices
- Searching for instances
- Each motif object has an associated Position-Specific Scoring Matrix
- Comparing motifs
- De novo motif finding
  - MEME
  - AlignAce
Cluster analysis
- Data representation
- Missing values
- Random number generator
- Distance functions
  - Euclidean distance
  - City-block distance
  - The Pearson correlation coefficient
  - Absolute Pearson correlation
  - Uncentered correlation (cosine of the angle)
  - Absolute uncentered correlation
  - Spearman rank correlation
  - Kendall’s \(\tau\)
  - Weighting
  - Calculating the distance matrix
- Calculating cluster properties
  - Calculating the cluster centroids
  - Calculating the distance between clusters
- Partitioning algorithms
  - \(k\)-means and \(k\)-medians
  - \(k\)-medoids clustering
- Hierarchical clustering
  - Representing a hierarchical clustering solution
  - Performing hierarchical clustering
- Self-Organizing Maps
- Principal Component Analysis
- Handling Cluster/TreeView-type files
  - Calculating the distance matrix
  - Calculating the cluster centroids
  - Calculating the distance between clusters
  - Performing hierarchical clustering
  - Performing \(k\)-means or \(k\)-medians clustering
  - Calculating a Self-Organizing Map
  - Saving the clustering result
- Example calculation
Supervised learning methods
- The Logistic Regression Model
- \(k\)-Nearest Neighbors
Graphics including GenomeDiagram
- GenomeDiagram
- Chromosomes
  - Simple Chromosomes
KEGG
- Parsing KEGG records
- Querying the KEGG API
Cookbook – Cool things to do with it
- Working with sequence files
- Sequence parsing plus simple plots
- Dealing with alignments
The Biopython testing framework
- Running the tests
  - Running the tests using Tox
- Writing tests
  - Writing a print-and-compare test
  - Writing a unittest-based test
- Writing doctests
Advanced
- Parser Design
- Substitution Matrices
  - SubsMat
  - FreqTable
Where to go from here – contributing to Biopython
- Bug Reports + Feature Requests
- Mailing lists and helping newcomers
- Contributing Documentation
- Contributing cookbook examples
- Maintaining a distribution for a platform
Appendix: Useful stuff about Python
- What the heck is a handle?
  - Creating a handle from a string
About the contents
- Authorship
References
