Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor FormattedVector for faster formatting of S3 objects #646

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

dfalbel
Copy link
Contributor

@dfalbel dfalbel commented Nov 27, 2024

The variables pane uses the FormattedVector API to format vector elements to be displayed. One important assumption that we do when using this API is that it's lazy, in the sense that we create an iterator over the elements of a vector and we only pay the formatting price for the values we actually iterated.

This is true for atomic vectors, for which we iterate using R's C API eg, extracting values with REAL_ELT and formattinng them in Rust. However, this is not true for any vector that has a class (except factors), in this case we need to use the format function, and we were actually formatting the entire vector before being able to iterate over the formatted values.
This is very costly if we need for format very large vectors, as when we expand the data.frame in posit-dev/positron#3628 (comment)

This PR refactors the FormattedVector API, with the main change being:

  • We no longer format the entire vector upon construction of the FormattedVector
  • Instead, we'll only format the vector when creating the iter()
  • We provide a few different iterators: iter(), iter_n(), column_iter(), column_iter_n(). Those suffixed with n will only format the requested part of the vector.

As a consequence, now creating the iter() returns a Result, so we need to adapt in a few places.
Now, most of the times in the variables pane we'll use iter_n() to only format the first N elements, that are necessary
to create the display value in the variables pane.

@dfalbel dfalbel requested a review from lionel- November 27, 2024 15:33
@dfalbel dfalbel force-pushed the bugfix/refactor-formatted-vector branch from 15f6f34 to 897cc9f Compare November 27, 2024 18:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant