Yue, Alice - Feature-based Comparison of Flow Cytometry Data...

This thesis has been submitted to the Library for purposes of graduation, but needs to be audited for technical details related to publication in order to be approved for inclusion in the Library collection.
Term: 
Summer 2017
Degree: 
M.Sc.
Degree type: 
Thesis
Department: 
School of Computing Science
Faculty: 
Applied Sciences
Senior supervisor: 
Cedric Chauve
Co-supervisor, if any: 
Ryan Brinkman
Thesis title: 
Feature-based Comparison of Flow Cytometry Data
Given Names: 
Alice
Surname: 
Yue
Abstract: 
Flow cytometry (FCM) bioinformatics is a sub-field of bioinformatics that aims to develop effective and efficient computational tools to store, organize, and analyze high-throughput, high-dimensional FCM data. Flow cytometers can analyze thousands of cells per second for up to 40 features. These features primarily signal the presence of different proteins on cells in the bloodstream, and so FCM contributes large amounts of data to the big biological data paradigm. The data that a flow cytometer outputs from a biological sample is called an FCS file. The International Mouse Phenotyping Consortium (IMPC) is a collaboration between 23 international institutions and funding organizations. Its aim is to decipher the function of 20,000 mouse genes. IMPC does so by breeding mice with a certain gene knocked out (KO), cancelling the function of that gene. In turn, FCM is used to measure the immunological changes correlated with this knockout. Many tools exist to classify FCS files. However, there is a lack of tools to conduct unsupervised clustering of FCS files. One goal of IMPC is to compare and contrast KO genes, which makes IMPC a prime motivation for this problem. As such, this thesis outlines a data processing pipeline used to isolate features for each FCS file. We then test the different types of features extracted on a benchmark data set from the FlowCAP-II challenge, containing data from healthy persons and patients with acute myeloid leukemia (AML). Finally, we evaluate how well these features separate FCS files of different origins (i.e., healthy vs. AML).
Keywords: 
Bioinformatics; Flow Cytometry; Feature Design
Total pages: 
173