Nature built proteins through evolution by natural selection, and our approach is to study the relation between protein sequence and function through this same process. To this end, we combine high-throughput microfluidic biochemistry experiments, statistical data analysis, and mathematical models inspired by statistical physics to generate and analyze controlled evolutionary trajectories of proteins. The outputs of these trajectories are then compared with models derived from natural protein sequences.
Our current experimental workflow consists of constructing libraries of millions of mutants of an enzyme, which we encapsulate one by one in monodisperse droplets. The proteins in each droplet are expressed, assayed for enzymatic function by fluorescence, and sorted into bins, each corresponding to a level of enzymatic activity. Deep sequencing of the genes that encode the proteins in each bin yields quantitative information on the relation between sequence (genotype) and function (phenotype). This large-scale inference of the genotype-phenotype map (or "fitness landscape") can be compared to models derived from natural enzyme sequence repositories.
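To make the sort-and-sequence step concrete, here is a minimal sketch of how per-bin read counts could be turned into an activity score for each variant. It is an illustration under stated assumptions, not the analysis pipeline used in the project: the function name `activity_scores`, the variant labels, and the representative activity value assigned to each bin are all hypothetical, and the sketch ignores the fraction of droplets sorted into each bin, which a real analysis would likely need to account for.

```python
from typing import Dict, List


def activity_scores(
    counts: Dict[str, List[int]], bin_activity: List[float]
) -> Dict[str, float]:
    """Estimate one activity score per variant from sequencing reads per bin.

    counts[variant][b] is the number of reads of `variant` found in bin b.
    bin_activity[b] is an assumed representative activity level for bin b.
    """
    n_bins = len(bin_activity)
    # Total reads per bin, used to correct for unequal sequencing depth.
    depth = [sum(c[b] for c in counts.values()) for b in range(n_bins)]

    scores: Dict[str, float] = {}
    for variant, c in counts.items():
        # Frequency of this variant within each bin (0 if the bin is empty).
        freq = [c[b] / depth[b] if depth[b] > 0 else 0.0 for b in range(n_bins)]
        total = sum(freq)
        if total == 0:
            continue  # variant never observed in any bin
        # Frequency-weighted mean of the bin activity levels.
        scores[variant] = sum(f * a for f, a in zip(freq, bin_activity)) / total
    return scores


# Hypothetical toy data: two variants, three bins (low, medium, high activity).
example_counts = {"WT": [0, 10, 90], "A12G": [90, 10, 0]}
example_scores = activity_scores(example_counts, bin_activity=[0.0, 0.5, 1.0])
```

In this toy example the variant whose reads concentrate in the high-activity bin receives the higher score. The resulting table of sequence-score pairs is one simple form the genotype-phenotype map can take.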
We propose several lines of work on this theme for a Master's internship or PhD thesis, ranging from purely theoretical to predominantly experimental, depending on the interests of the candidate. A background in the quantitative sciences (physics, mathematics, or engineering) is desirable, but no prior knowledge of protein evolution or biology is required, provided the candidate is motivated to learn these subjects.