examples: determine effective kon #713
base: main
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,161 @@ | ||
|
||
Rate of binding | ||
=============== | ||
|
||
.. only:: html | ||
|
||
:nbexport:`Download this page as a Jupyter notebook <self>` | ||
|
||
.. _kon: | ||
|
||
Determine the rate of binding | ||
----------------------------- | ||
|
||
In this Notebook, we will determine the binding time of a fluorescently labeled protein binding to DNA. | ||
The protein binds and unbinds to target sites on DNA and the result is recorded as a kymograph. | ||
We track the binding events, and then determine the time intervals *between* the binding events: | ||
|
||
.. image:: kon.png | ||
|
||
These time intervals tell you how long it takes for a protein to bind to an empty target site. | ||
|
||
The binding time, :math:`\tau_{on}`, relates to the on-rate of the protein, :math:`k_{on}`, as :math:`\tau_{on}=1/(k_{on}[P])`, where :math:`[P]` is the protein concentration. | ||
The binding rate :math:`k_{on}` relates to the dissociation constant :math:`K_d` as: | ||
|
||
.. math:: | ||
|
||
K_d = \frac{k_{off}}{k_{on}} | ||
|
||
For this example, we don't know the protein concentration and therefore cannot determine :math:`k_{on}`. | ||
Instead, we will determine the binding time and refer to its inverse as the *effective binding rate*, :math:`k'_{on} = k_{on}[P]`. | ||
|
||
Load and plot the kymographs | ||
---------------------------- | ||
|
||
The kymograph and corresponding tracks that are used in this tutorial are stored on zenodo.org. | ||
The following line of code downloads the data and stores the data in the folder `"test_data"`:: | ||
|
||
filenames = lk.download_from_doi("10.5281/zenodo.14198300", "test_data") | ||
|
||
Load and plot the kymograph:: | ||
|
||
file1 = lk.File("test_data/kymo1.h5") | ||
_, kymo1 = file1.kymos.popitem() | ||
|
||
plt.figure() | ||
kymo1.plot("g", aspect = 5, adjustment=lk.ColorAdjustment([0], [5])) | ||
|
||
.. image:: kymo1.png | ||
|
||
Load the tracks | ||
--------------- | ||
|
||
For this tutorial, the binding events have already been tracked in Pylake. | ||
Load the tracks as follows:: | ||
|
||
tracks1 = lk.load_tracks("test_data/tracks1.csv", kymo1.crop_by_distance(4.9,13.8), "green") | ||
|
||
Note that the kymograph passed to `lk.load_tracks` is cropped, because tracking was performed on a cropped kymograph, see :ref:`tracking`. | ||
|
||
To load tracks exported from Lakeview, use the same approach but omit the call to :func:`Kymo.crop_by_distance() <lumicks.pylake.kymo.Kymo.crop_by_distance>`. | ||
|
||
Select target location | ||
---------------------- | ||
|
||
Plot the tracks:: | ||
|
||
plt.figure() | ||
tracks1.plot() | ||
plt.show() | ||
|
||
.. image:: tracks1.png | ||
|
||
Next, select the coordinates of the target binding site for which you would like to determine the on-rate. | ||
Often, the location of a target site is identified using, for example, fluorescent markers. | ||
On this kymograph, all binding events were on a target sequence, so we can select the target locations manually. | ||
|
||
First, we select the following region:: | ||
|
||
plt.figure() | ||
tracks1.plot() | ||
plt.hlines(y=8.4, xmin=0,xmax=320) | ||
plt.hlines(y=9, xmin=0, xmax=320) | ||
|
||
.. image:: track_selection1.png | ||
|
||
Select all tracks whose mean position lies between the two coordinates indicated in the above image:: | ||
|
||
track_selection1 = tracks1[[8.4 < np.mean(track.position) < 9 for track in tracks1]] | ||
aafkevandenberg marked this conversation as resolved.
|
||
|
||
Plot the final selection of tracks:: | ||
|
||
plt.figure(figsize = (9,1)) | ||
track_selection1.plot() | ||
|
||
.. image:: track_selection1_plot.png | ||
|
||
Since we are using a repeat sequence and all observed binding events were on-target, we select multiple regions on the same kymograph:: | ||
|
||
coordinates = [(8.4,9),(7,7.6),(6.2,6.8),(5.5,6.1),(4.8,5.4),(4.1,4.7),(3.3,3.9),(2.6,3.2),(1.9,2.5),(1.2,1.8),(0.5,1.1)] | ||
|
||
Using the above coordinates, we can select the corresponding region from the kymograph, and compute the time intervals between the tracked binding events:: | ||
|
||
def time_intervals(tracks): | ||
"""Compute the time intervals between all tracks in a given selection""" | ||
intervals = [tracks[x+1].seconds[0]-tracks[x].seconds[-1] for x in range(len(tracks)-1)] | ||
You rely on tracks being sorted with respect to time and non-overlapping here. Unfortunately, tracks in a group are not guaranteed to be sorted in time. While this is usually the case for I would sort them and throw if you encounter overlaps. The second thing you rely on is that this is a |
||
return intervals | ||
|
||
intervals_total = [] | ||
|
||
for coordinate in coordinates: | ||
bot, top = coordinate | ||
track_selection = tracks1[[bot < np.mean(track.position) < top for track in tracks1]] | ||
intervals = time_intervals(track_selection) | ||
intervals_total += intervals | ||
aafkevandenberg marked this conversation as resolved.
|
||
|
||
All the time intervals between binding events are stored in the list `intervals_total`. Check how many intervals we have in total:: | ||
|
||
>>> len(intervals_total) | ||
46 | ||
|
||
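The interval computation above takes consecutive tracks in the order they are stored, so it implicitly assumes that the tracks are sorted in time and do not overlap. A minimal sketch of a more defensive variant is shown below. The name `time_intervals_sorted` and the `StubTrack` class are hypothetical and only serve to make the example runnable without data; the function should work with any iterable whose items expose a `seconds` sequence, as the tracks in the code above do.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class StubTrack:
    """Hypothetical stand-in for a tracked binding event; only `seconds` is used."""

    seconds: Sequence[float]


def time_intervals_sorted(tracks) -> List[float]:
    """Gaps between consecutive tracks after sorting them by start time.

    Raises ValueError if two tracks overlap in time.
    """
    ordered = sorted(tracks, key=lambda track: track.seconds[0])
    intervals = []
    for previous, current in zip(ordered, ordered[1:]):
        # Gap between the end of one track and the start of the next one
        gap = current.seconds[0] - previous.seconds[-1]
        if gap < 0:
            raise ValueError("Tracks overlap in time; cannot define a gap between them")
        intervals.append(gap)
    return intervals


# Deliberately unsorted input to show that the order does not matter
example = [StubTrack([10.0, 12.0]), StubTrack([0.0, 3.0]), StubTrack([20.0, 21.0])]
gaps = time_intervals_sorted(example)
```

Checking only adjacent pairs is sufficient once the tracks are sorted by start time, because any overlap between two tracks implies an overlap between neighbours in the sorted order.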
Determine kon | ||
------------- | ||
|
||
Binding times are typically exponentially distributed. The distribution can be expressed in terms of the effective on-rate, :math:`k'_{on}`, or in terms of the binding time, :math:`\tau_{on}`: | ||
|
||
.. math:: | ||
|
||
P(t) = k'_{on}e^{-k'_{on}t} = \frac{1}{\tau_{on}} e^{-t/\tau_{on}} | ||
|
||
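For an exponential distribution observed without truncation or censoring, the maximum-likelihood estimate of :math:`\tau_{on}` is simply the mean of the observed intervals. The sketch below illustrates this on synthetic data; the value of 36 seconds and the sample size are illustrative assumptions, not the tutorial's data.

```python
import random
import statistics

random.seed(0)

tau_true = 36.0  # seconds; illustrative value used to generate synthetic data
samples = [random.expovariate(1.0 / tau_true) for _ in range(5000)]

# Maximum-likelihood estimate of the binding time for an exponential distribution
tau_hat = statistics.fmean(samples)

# The effective on-rate is the inverse of the binding time
k_eff_hat = 1.0 / tau_hat
```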
Fit an exponential distribution to the distribution of time intervals using Pylake:: | ||
|
||
single_exponential_fit = lk.DwelltimeModel(np.array(intervals_total), n_components=1) | ||
I think it would make sense to give some thought on discretization and the minimally observable time. Discretization is easy, that's just the line time. The latter I would expect to be related to the gap window size in some way since you can't really observe on times shorter than this (because they would be connected to the adjacent track). For this data, it isn't really needed, given that the binding times you measure here are quite long (36 seconds). I don't expect it to change the answer much in this case (since your line time is much shorter), but I think it would make the notebook a bit more generally usable for people who have short on timescales. What do you think? |
||
|
||
plt.figure() | ||
single_exponential_fit.hist() | ||
plt.show() | ||
|
||
.. image:: hist_fit.png | ||
|
||
The fitted binding time is 36 seconds, which is equivalent to an effective rate :math:`k'_{on} = 1/36 = 0.028 s^{-1}`. | ||
|
||
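The fit above does not account for a minimum observable time. When intervals shorter than some cutoff cannot be observed, the plain mean overestimates the binding time. The sketch below illustrates the effect on synthetic data, using the fact that for an exponential distribution truncated from below at `t_min`, the mean of the observed values equals `t_min` plus the true time constant. All values are illustrative assumptions. Pylake's `DwelltimeModel` is understood to accept a `min_observation_time` argument for this purpose (and a `discretization_timestep` argument in recent versions); check the documentation of your installed version before relying on either.

```python
import random
import statistics

random.seed(1)

tau_true = 2.0  # seconds; illustrative short time constant
t_min = 1.0  # seconds; hypothetical minimum observable interval

all_times = [random.expovariate(1.0 / tau_true) for _ in range(20000)]

# Intervals shorter than t_min are never observed
observed = [t for t in all_times if t >= t_min]

# The naive estimate ignores the cutoff and is biased upwards
tau_naive = statistics.fmean(observed)

# For a left-truncated exponential, subtracting the cutoff removes the bias
tau_corrected = tau_naive - t_min
```

For the data in this tutorial the correction is expected to be small, since the fitted binding time is much longer than the line time.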
The confidence intervals can be determined using bootstrapping:: | ||
|
||
bootstrap = single_exponential_fit.calculate_bootstrap(iterations=10000) | ||
|
||
plt.figure() | ||
bootstrap.hist(alpha=0.05) | ||
plt.show() | ||
|
||
.. image:: bootstrap.png | ||
|
||
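To illustrate the idea behind bootstrapping, the sketch below resamples a set of intervals with replacement and takes percentiles of the resampled means. It is a simplified stand-in, not Pylake's implementation: it uses synthetic intervals rather than the tutorial's data, and the helper name `bootstrap_mean_ci` is hypothetical.

```python
import random
import statistics


def bootstrap_mean_ci(data, iterations=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `data`."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement and recompute the estimate each time
    means = sorted(statistics.fmean(rng.choices(data, k=n)) for _ in range(iterations))
    lower = means[int((alpha / 2) * iterations)]
    upper = means[int((1 - alpha / 2) * iterations) - 1]
    return lower, upper


# Synthetic intervals (46 draws with a 36 s time constant); not the tutorial's data
rng = random.Random(42)
synthetic_intervals = [rng.expovariate(1.0 / 36.0) for _ in range(46)]

lower, upper = bootstrap_mean_ci(synthetic_intervals)
```

With only a few dozen intervals the resulting interval is wide, which is why the number of tracked events matters for the precision of the estimate.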
Conclusion and Outlook | ||
---------------------- | ||
|
||
The binding time is 36 seconds with a 95% confidence interval of (24, 50) seconds. | ||
|
||
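The same result can be expressed as an effective on-rate. Because taking the inverse is a monotonically decreasing transformation, the bounds of the interval swap places. A short sketch using the values quoted above:

```python
tau_on = 36.0  # fitted binding time in seconds
tau_ci = (24.0, 50.0)  # 95% confidence interval in seconds

# Effective on-rate in 1/s
k_eff = 1.0 / tau_on

# Inverting swaps the bounds: the longest time gives the lowest rate
k_eff_ci = (1.0 / tau_ci[1], 1.0 / tau_ci[0])

print(f"k'_on = {k_eff:.3f} 1/s, 95% CI ({k_eff_ci[0]:.3f}, {k_eff_ci[1]:.3f}) 1/s")
```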
As mentioned in the introduction, the obtained binding time depends on the protein concentration. | ||
Since we don't know the protein concentration, this value can only be compared to measurements with the same protein concentration in the flow cell. | ||
If you would like to compute the dissociation constant and compare to bulk experiments, the concentration has to be determined [1]_. | ||
|
||
.. [1] Schaich *et al*, Single-molecule analysis of DNA-binding proteins from nuclear extracts (SMADNE), NAR (2023) |
For this one, and also the other binding notebook actually, we should probably mention that we neglect photobleaching here (which would lengthen tau_on and shorten tau_off). Users should make sure that their samples aren't expected to bleach on the timescales that they are considering.
One paper that might be interesting to look at in the future would be this one, though it needs a very large number of counts (I wouldn't suggest that for this analysis right now, as I doubt this amount of data would work reliably there). There are more pragmatic ways to correct for bleaching too, but given that we haven't tested those, I would not explicitly recommend a particular one here.
Good point. I'll add a note that this tutorial assumes that bleaching time >> binding time. This is what we advise for these types of experiments.
And a correction to your note: t_on shortens, while t_off lengthens due to photobleaching.
Looking at the paper, t_on corresponds to the binding lifetime (so 1/koff) in our experiments. Equation 17 is a common approach to include the bleaching time (some authors have used it, including me) and I was about to incorporate it in a tutorial.
It would be great to include the bleaching time as an optional parameter in the Pylake functions for fitting exponential distributions. Often, users don't know the bleaching time, as a permanently bound dye is required to measure this quantity, but it can, e.g., be measured from a dye stuck on the surface.
I have never seen the approach to determine t_off while accounting for bleaching, but it makes sense that t_off lengthens when accounting for bleaching.
Sorry, I meant tau, but you're right of course, I'll edit the original message to reflect that.
For now, I think adding that note and mentioning that bleaching isn't taken into account is sufficient.
And nice! Makes sense! Looking forward to that tutorial.