Y.H.A. Leurs1, W. van den Hout1, A. Gardin1, J.L.J. van Dongen, J.C.M. van Hest*, Francesca Grisoni*, L. Brunsveld*
1These authors contributed equally to this work.
*Corresponding authors: [email protected], [email protected], [email protected].
This repository is not the final version and may be subject to change.
Biomolecular condensates are essential functional cellular structures that form through phase separation of macromolecules such as proteins and RNA. Synthetic condensates have recently gathered great interest as they can be engineered to better understand the formation mechanism of these cellular condensates and serve as cell-mimetic platforms to develop novel therapeutic strategies. The complexity of the biomolecular components and their reciprocal interactions, however, makes precise engineering and systematic characterization of condensate formation a challenging endeavor. While constructing phase diagrams is a systematic approach to gain comprehensive insight into phase separation behavior, it is a time-consuming and labor-intensive process. Here, we present an automated platform for efficiently mapping multi-dimensional phase diagrams of condensates. The automated platform incorporates a pipetting system for sample formulation, and an autonomous confocal microscope for particle property analysis and characterization. Active machine learning – which allows iterative model improvement – is used to learn from previous experiments and steer future experiments towards an efficient exploration of phase boundaries. The versatility of the pipeline is demonstrated by showcasing its ability to rapidly explore the phase behavior of various polypeptides of opposite charge across formulations, producing detailed and reproducible multidimensional phase diagrams. Beyond identifying phase boundaries, the platform also provides information-rich data, enabling quantification of key condensate properties such as particle size, count, and volume fraction – adding functional insights to phase diagrams. 
This self-driven platform is robust and generalizable, allowing easy extension to any given combination of condensate-forming materials, ultimately providing key insights into their formation and characteristics.
This repository contains the code used to apply the active machine learning pipeline described in the main paper and depicted in the figure above (section III, panels H, I, and J).
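As a rough illustration of the active learning idea described above, the sketch below runs a select-measure-update loop on a synthetic two-component phase diagram. It is not the code in this repository: the `synthetic_experiment` rule, the k-nearest-neighbour vote, and the uncertainty-sampling criterion are all assumptions chosen to keep the example dependency-free, and the model and acquisition strategy used in the paper may differ.

```python
import math
import random


def synthetic_experiment(x, y):
    """Hypothetical stand-in for a real measurement.

    Returns 1 if condensates 'form' and 0 otherwise, using an invented rule.
    """
    return 1 if x * y > 0.25 else 0


def knn_probability(point, labelled, k=5):
    """Fraction of the k nearest labelled points that phase separated."""
    nearest = sorted(labelled, key=lambda item: math.dist(point, item[0]))[:k]
    return sum(label for _, label in nearest) / len(nearest)


def select_next(candidates, labelled, k=5):
    """Uncertainty sampling: pick the candidate whose prediction is closest to 0.5."""
    return min(candidates, key=lambda p: abs(knn_probability(p, labelled, k) - 0.5))


def run_active_learning(n_cycles=20, grid_size=11, n_initial=6, seed=0):
    """Run the loop and return a list of ((x, y), label) pairs."""
    rng = random.Random(seed)
    # Candidate formulations: a regular grid of normalized concentrations.
    grid = [
        (i / (grid_size - 1), j / (grid_size - 1))
        for i in range(grid_size)
        for j in range(grid_size)
    ]
    rng.shuffle(grid)

    # Start from a few randomly chosen formulations.
    labelled = [(p, synthetic_experiment(*p)) for p in grid[:n_initial]]
    candidates = grid[n_initial:]

    for _ in range(n_cycles):
        if not candidates:
            break
        # Choose the most uncertain formulation, 'measure' it, and add it to the data.
        next_point = select_next(candidates, labelled)
        candidates.remove(next_point)
        labelled.append((next_point, synthetic_experiment(*next_point)))
    return labelled


if __name__ == "__main__":
    results = run_active_learning()
    separated = sum(label for _, label in results)
    print(f"{len(results)} formulations measured, {separated} phase separated")
```

Selecting the point whose predicted probability is closest to 0.5 tends to concentrate measurements near the estimated phase boundary. That is the motivation for steering experiments with a model instead of sampling the whole grid.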
This repository is structured in the following way:
experiments/: folder containing the experiments completed by our platform and some test cases.
figures/: folder containing high-resolution figures as reported in the main paper.
robotexperiments/: main folder containing the .py files defining the package.
script/: folder containing scripts for setting up (experiments/) and running (cycles/) the experiments, and for plotting results (plots/).
environment.yml: the environment file used to install the package.
setup.py: the file for installing the package.
conda env create -f environment.yml
The installation requires another custom Python package, ActiveLearningCLassiFier, which is listed in the environment.yml file and is available separately online.
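After creating the environment, a quick check like the following can confirm that the expected packages are importable. This is only a sketch: the import name `robotexperiments` is an assumption based on the folder name listed above, and the import name of the ActiveLearningCLassiFier dependency is not stated here, so it is left for the reader to add.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # "robotexperiments" is an assumed import name taken from the folder name.
    # Add the import name of the custom dependency once it is known.
    missing = missing_packages(["robotexperiments"])
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```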
Automated navigation of condensate phase behavior with active machine learning.
Yannick Leurs, Willem van den Hout, Andrea Gardin, Joost van Dongen, Jan van Hest, Francesca Grisoni, Luc Brunsveld
ChemRxiv, 04 December, 2024.
DOI: https://doi.org/10.26434/chemrxiv-2024-frnj3