Only Masters students funded under the CIHR/MSFHR Bioinformatics training program have the opportunity to participate in 3 rotations in different labs. If you are funded by a Faculty member in the program, you are not required to take the rotations. Below are some examples of rotations from the past:
A funded MITACS – Accelerate internship project is available.
A Vancouver company is seeking help to compile clinically relevant information on functional genomics and to find ways of presenting it to oncologists and other practitioners of cancer medicine.
The required skill set is clinical research experience in solid tumor oncology and literature data mining; some experience with analysis of gene expression microarray data is highly desirable. The company seeks to collaborate with a university-based research group (an Accelerate intern and their professor) that has expertise in genes and pathways involved in cancer (e.g., drug targets) as well as therapeutic approaches (e.g., chemotherapy, small molecule inhibitors and monoclonal antibody therapies) used in oncology.
This project has no wetlab component; it focuses on literature/data mining and on connecting the information that the company has from genomic analysis to what a physician wants to see. The company needs to come up with an automated way of extracting, for each patient, the most relevant drug-targetable pathways that appear to be in an active state based on the expression data.
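As a rough illustration of what "extracting the most relevant active pathways per patient" could look like, the sketch below ranks pathways by the mean expression z-score of their member genes. It is only a minimal example under stated assumptions: the pathway names, gene lists, threshold, and the function `rank_active_pathways` are hypothetical and are not taken from the company's data or methods.

```python
from statistics import mean

# Hypothetical, illustrative pathway definitions: pathway name -> member genes.
PATHWAYS = {
    "EGFR signalling": ["EGFR", "KRAS", "BRAF"],
    "PI3K/AKT": ["PIK3CA", "AKT1", "MTOR"],
}


def rank_active_pathways(expression_z, pathways, threshold=1.0):
    """Return (pathway, score) pairs with mean z-score >= threshold, highest first.

    expression_z maps gene symbols to per-patient expression z-scores.
    Genes missing from the patient's profile are ignored.
    """
    scores = []
    for name, genes in pathways.items():
        values = [expression_z[g] for g in genes if g in expression_z]
        if not values:
            continue  # no measured genes for this pathway
        score = mean(values)
        if score >= threshold:
            scores.append((name, score))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Made-up z-scores for a single patient.
patient = {"EGFR": 2.5, "KRAS": 1.5, "BRAF": 2.0,
           "PIK3CA": 0.1, "AKT1": -0.3, "MTOR": 0.2}
active = rank_active_pathways(patient, PATHWAYS)
```

A real pipeline would likely use curated pathway databases and a more robust activity score; the mean z-score is used here only because it is easy to follow.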
The internship will be 4 months long, with the intern spending ~50% of the time at the company with the balance back at the university with the supervising professor. The intern’s supervising professor needs to consent to the intern spending their time on the project.
1) Title: Whole genome and whole transcriptome sequencing of prostate tumours.
Lab: Dr. Colin Collins, Vancouver Prostate Centre, Microarray Facility
Co-supervisor: Cenk Sahinalp (http://www.cs.sfu.ca/%7Ecenk/), Computing Science, SFU
It is established that cancer arises through the accumulation of genetic and epigenetic abnormalities that result in aberrant expression of genes and gene fusions that deregulate critical cellular control pathways and checkpoints. Current therapy is aimed at killing tumour cells that express these aberrant gene products. However, tumour cells often react to therapeutic treatment by activating genes that confer a survival advantage, enabling them to become resistant to therapy, multiply, and metastasize to additional organs. As a result, after exhausting all available therapies, most patients ultimately die. Because treatment resistance is the underlying basis for most cancer deaths, it is the major focus of research in our laboratory. We have initiated a large-scale program to decipher the genome and transcriptome structures of some 200 prostate cancer specimens with the goal of understanding the molecular determinants of the lethal (metastatic) phenotype. Our lab has access to a bank of >2000 consented tumours, including the world’s largest repository of pre- and post-treated prostate cancers, tumours stratified from low to high risk, matched primary and metastatic tumours, and treatment-resistant cancers. Next-generation sequencing provides information of an unprecedented depth and scale. Our ongoing research is expected to identify novel biomarkers and therapeutic targets using Whole Genome Shotgun Sequencing (WGSS) and Whole Transcriptome Shotgun Sequencing (WTSS) data that will be validated in larger tumour cohorts.
A number of computational algorithms and tools are currently under development to identify mutations, translocations, fusion genes, alternative/aberrant transcripts and copy number changes linked to disease progression and resistance to therapy. The student(s) will be presented with several opportunities to pursue within the current research projects:
Whole transcriptome analysis with the goal of identification of alternative/aberrant transcripts in prostate tumours.
Whole genome analysis with the goal of identification of point mutations, translocations, inversions, deletions, and insertions in prostate tumours. This project and the whole transcriptome project above involve development/tuning of algorithms for detection of biomarkers for experimental validation and functional assessment by the laboratory team.
Integration of the whole genome and transcriptome data with the goal of screening for aberrations in genes that encode therapeutic targets. This project is expected to yield a valuable tool for guiding personalized therapy for cancer patients. The project involves management and visualization of diverse data sets and the development of advanced methods for querying public and in-house databases.
The candidate will work closely with a team of bioinformaticians, research scientists, and pathologists at VPC and will be actively involved in ongoing collaborations with the Centre for Translational and Applied Genomics, the British Columbia Cancer Agency’s Genome Sciences Centre and Simon Fraser University. This is a unique opportunity for a motivated individual to make an important impact on prostate cancer and to bring about the advent of personalized medicine using leading-edge experimental and bioinformatics technologies.
Evaluation: A final report and expected results listed for every project above will be used to evaluate the progress.
2) Title: Mechanisms of gene regulation at the transcript level
Lab: Irmtraud Meyer, UBC Centre for High-Throughput Biology and Department of Computer Science, UBC
We know by now from many dedicated experiments that the expression of protein-coding genes in higher eukaryotes is regulated by a range of mechanisms, not only at the DNA level (e.g. by transcription factors binding the genomic DNA), but also at the RNA level. Whereas regulation at the DNA level has been extensively studied, comparatively little is known about how genes are regulated at the RNA level, i.e. at the pre-mRNA and mRNA level.
My group has developed a range of new computational methods that allow us to detect and study potential mechanisms of gene regulation at the RNA level. We are particularly interested in detecting and understanding novel mechanisms of gene regulation. The hard part of the project has already been completed, i.e. the source code of the new method is already written. You would be the first to do a comprehensive data analysis, after which the project would be ready for publication.
My group has a high-performance computer cluster which allows us to study genome-wide data sets. Please don’t hesitate to email me at firstname.lastname@example.org in case you have any questions.
Requirements: Very good Perl/Python programming skills and experience working with Unix. Willingness and, ideally, prior experience compiling data sets from existing databases (e.g. Ensembl, NCBI etc.). You will have the chance to become acquainted with R.
Evaluation: Digital documentation of the project and a presentation in our Bioinformatics journal club.
* Book: Durbin et al., Biological sequence analysis, Cambridge University Press, 1998
[ISBN-13: 9780521629713], chapters on hidden Markov models and stochastic context-free grammars.
* I.M. Meyer, Predicting novel RNA-RNA interactions, Current Opinion in Structural Biology 18(3):387-93. (2008)
* I.M. Meyer, I. Miklos, Statistical evidence for conserved, local secondary structure in the coding regions of eukaryotic mRNAs and pre-mRNAs, Nucleic Acids Research 33(19), 6338-6348. (2005)
* J.S. Pedersen*, I.M. Meyer*, R. Forsberg, P. Simmonds, J. Hein, A comparative method for finding and folding RNA secondary structures in protein-coding regions, Nucleic Acids Research 32, 4925-4936. (2004)
3) Title: Detection of residue-residue interactions by comparative analysis of protein sequences.
Supervisor: Art F. Y. Poon, BC Centre for Excellence in HIV/AIDS (SFU).
Co-supervisor: Frederic Pio, Molecular Biology & Biochemistry, SFU
A crucial area of bioinformatics research is the prediction of residue contacts and/or long-range interactions from the comparative analysis of protein-coding sequences. Literally dozens of investigators in different domains have proposed their own algorithm to detect co-evolving residues. Generally speaking, all these methods accomplish this aim by identifying correlated patterns of substitution, such that a substitution at one site in a protein tends to be followed by another substitution at a second site. None of these methods, however, enables the user to specify an arbitrary pattern of residue interactions to simulate; as a result, there are no benchmark data by which to measure algorithm performance. Also, none accounts for the effect of recombination, which tends to break up the signal of correlated substitutions. In this project, the student will (1) implement simulations of protein-coding sequence evolution parameterized by a residue interaction graph and a phylogeny, and (2) evaluate the performance of several algorithms with and without recombination. Ideally, the student should be acquainted with either Python or C/C++; supplemental training will be provided if necessary.
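To make "correlated patterns of substitution" concrete, here is a minimal sketch that scores a pair of alignment columns by their mutual information. Mutual information is one widely used co-variation measure and is shown only as an example; it is not necessarily one of the algorithms this project would evaluate, and it ignores both the phylogeny and recombination discussed above. The function names and the toy alignment are illustrative assumptions.

```python
from collections import Counter
from math import log2


def column(alignment, index):
    """Extract one column (list of residues) from a list of aligned sequences."""
    return [sequence[index] for sequence in alignment]


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_a)
    if n == 0 or n != len(col_b):
        raise ValueError("columns must be non-empty and of equal length")
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), joint in count_ab.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (joint / n) * log2(joint * n / (count_a[a] * count_b[b]))
    return mi


# Toy alignment: sites 0 and 1 always change together, site 2 varies independently.
toy_alignment = ["AKL", "AKV", "GEL", "GEV"]
mi_coupled = mutual_information(column(toy_alignment, 0), column(toy_alignment, 1))
mi_independent = mutual_information(column(toy_alignment, 0), column(toy_alignment, 2))
```

In the toy alignment the coupled pair scores one bit and the independent pair scores zero. Simulated data with a known interaction graph, as proposed in the project, would allow such scores to be checked against ground truth.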
Evaluation: Evaluation of the project will be based on the quality (i.e. efficiency, precision, and maintainability) of the simulation code. Documentation of incomplete code will be more important than completion of the project.
4) Title: In silico docking and development of novel drug leads
Lab: Dr. Art Cherkasov, Urological Sciences, UBC (Jack Bell Centre, VGH) (in collaboration with Children & Women Research Institute and Prostate Centre)
This cheminformatics-related project will involve ‘in silico’ discovery of novel drug leads with potentially high affinity toward various human targets including the Sex Hormone Binding Globulin (SHBG), Corticosteroid Binding Globulin (CBG), Androgen Receptor (AR) and Insulin-like Growth Factor Binding Protein (IGFBP) among others. These proteins represent important drug targets, and the development of effective inhibitors for them is an important and promising task.
The successful candidate for the project will learn basic ‘in silico’ drug design skills involving protein homology modeling, virtual docking, pharmacophore construction and search, organization and manipulation of drug databases, structure-activity modeling and ‘in silico’ toxicity assessment of drug leads. The candidate will use a variety of in-house, public and commercial drug design programs (including 4 docking programs) and a database of more than 50 million chemical substances, and will also be involved in the development of novel docking scoring functions and docking protocols.
5) Title: Development of a repository for flow cytometry data
Supervisor: Dr. Ryan Brinkman, BC Cancer Research Centre
For more than 30 years, the fluorescence-based technique of flow cytometry has been widely used by clinicians and researchers to distinguish different cell types in mixed cell sub-populations, based on the expression of cellular markers. The flow cytometry data file standard (FCS) provides the specifications used to communicate raw measurements produced by flow cytometers. However, data captured by FCS files does not include annotation that would allow for independent interpretation of the experiment. Recently, the flow cytometry community has agreed on the Minimum Information about a Flow Cytometry Experiment (MIFlowCyt) standard, which specifies the minimum annotation that shall be provided along with the publication of flow cytometry experiments to ensure that these experiments can be understood by other experts in the field. However, at this point, no software tools exist to ease the process of creating such an annotation and there is no public data repository that would capture annotated flow cytometry data. Therefore, as was previously the case in other genomics technologies, there is now a need for a public, community-based, flow cytometry data repository.
The student will aid in the development of an open source flow cytometry data repository to enable storage and sharing of flow cytometry data sets and analysis descriptions associated with peer-reviewed publications. The repository will be designed around a web-based interface that will allow researchers to submit and annotate their data based on the Minimum Information about a Flow Cytometry Experiment (MIFlowCyt) standard. This work will be done in collaboration with other members of the Brinkman group already working on the project.
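One small piece of such a repository is checking that a submission carries the required annotation before it is accepted. The sketch below shows the idea only. The section names are simplified placeholders loosely modelled on the broad areas a minimum-information checklist covers; they are not the normative MIFlowCyt field list, and `missing_annotation` is a hypothetical helper, not part of any existing tool.

```python
# Placeholder section names (assumption): the real checklist is defined by the
# MIFlowCyt standard itself and is considerably more detailed.
REQUIRED_SECTIONS = (
    "experiment_overview",
    "sample_details",
    "instrument_details",
    "data_analysis_details",
)


def missing_annotation(submission):
    """Return the required sections that are absent or empty in a submission dict."""
    return [section for section in REQUIRED_SECTIONS if not submission.get(section)]


# Made-up submission: one section filled in, one present but empty, two absent.
submission = {
    "experiment_overview": {"purpose": "example purpose"},
    "instrument_details": {},
}
problems = missing_annotation(submission)
```

A web front end could use the returned list to tell the submitter which parts of the annotation still need to be completed.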
Co-op student involvement
The student will be part of a small team dedicated to the development of the repository.
There are a variety of tasks that need to be completed in order to design and implement the web-based application. These include authentication mechanisms, data upload mechanisms, persistence (database) mechanisms, web page design (using JavaServer Pages and Servlets), implementation of server business logic, etc. The student’s personal preferences and experience will be taken into account when assigning specific tasks.
Expected learning experience
During the rotation the student will learn about flow cytometry as well as gain experience with software design and development. He/she will get hands-on experience with various Java 2 Enterprise Edition technologies and learn how to use a professional software development environment, including Eclipse, the SVN source code version control system, the JUnit test framework, Checkstyle, Findbugs, JDepend, Bugzilla and others.
6) Title: Adapting the BLAST Algorithm for a PlayStation 3 Cluster
Lab: Steven Hallam, Microbiology & Immunology, UBC Graduate Program in Bioinformatics
Environmental genomics, also known as metagenomics, applies methods of DNA library production, high-throughput sequencing and in silico analysis to answer questions related to abundance, diversity and metabolic potential in naturally occurring microbial populations. Metagenomic datasets encompass the inherent genetic diversity of the sample under study and consequently the metabolic potential of the community as a whole. Such complexity gives rise to analytic bottlenecks related to gene identification and clustering that require high performance computing methods. One specific bottleneck relates to the use of local sequence alignment algorithms such as BLAST on complex data sets. In exploring options to increase the processing capabilities of our cluster to align genomic and proteomic data sets, we have looked into the possibility of constructing a cluster of PlayStation 3 gaming consoles. Szalkowski et al. (2008) recently implemented the Smith-Waterman local alignment algorithm on a cluster of PlayStation 3s with respectable performance [1]. With the successes of their implementation in mind, we would like to adapt the BLAST algorithm to fully utilize the potential of our planned PlayStation 3 cluster and promote the genesis of novel approaches requiring greater computational power for analyzing environmental genomic data sets.
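For readers unfamiliar with the local alignment problem mentioned above, the following is a minimal sketch of the Smith-Waterman scoring recurrence in plain Python. It uses a linear gap penalty and illustrative scoring values chosen here as assumptions. It is neither the vectorized SWPS3 implementation nor BLAST, which relies on heuristics to avoid filling the full dynamic programming matrix.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of sequences a and b (linear gap penalty).

    Keeps only two rows of the dynamic programming matrix, so memory use is
    proportional to len(b). The scoring values are illustrative defaults.
    """
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores are floored at zero: a local alignment may start anywhere.
            current[j] = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            best = max(best, current[j])
        previous = current
    return best


identical = smith_waterman_score("ACGT", "ACGT")
embedded = smith_waterman_score("XXACGTXX", "ACGT")
unrelated = smith_waterman_score("AAAA", "TTTT")
```

Every cell depends only on its left, upper, and upper-left neighbours, which is what makes the recurrence amenable to the kind of vectorized and multi-threaded execution that a console cluster would exploit.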
Evaluation: A final report and a working cluster will be used to evaluate the project.
[1] Szalkowski, A. et al., SWPS3 – fast multi-threaded vectorized Smith-Waterman for IBM Cell/B.E. and x86/SSE2, BMC Research Notes, 2008, 1:107, doi:10.1186/1756-0500-1-107