A large speed up to extract_fin_year #96

Merged 5 commits on Jun 20, 2023

Moohan (Member) commented Jun 13, 2023

I was working on a similar function for one of my projects and realised the improvements I made there were also applicable here.

This is basically a full rewrite of the function, but there is no functional change (all the tests still pass).

This provides a speedup of about 70X for a single date and about 2X for 10 million dates, scaling between those two figures for other vector sizes (for example, about 19X at 1,000 dates). Importantly, the new version also allocates 2.5-3X less memory.

> bench::press(
+   n = c(1, 1e3, 1e5, 1e7),
+   {
+     dates <- create_dates(n)
+     bench::mark(
+      original = extract_fin_year(dates),
+      new = extract_fin_year_new(dates),
+      relative = TRUE,
+      min_time = 3
+     )
+   }
+ ) %>% 
+   print() %>% 
+   ggplot2::autoplot()
Running with:
         n
1        1
2     1000
3   100000
4 10000000
# A tibble: 8 × 14
  expression        n   min median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result    memory     time      
  <bch:expr>    <dbl> <dbl>  <dbl>     <dbl>     <dbl>    <dbl> <int> <dbl>   <bch:tm> <list>    <list>     <list>    
1 original          1 71.3   69.1       1       Inf        1      776    15      2.84s <chr [2]> <Rprofmem> <bench_tm>
2 new               1  1      1        71.1     NaN        1.47  9996     4    515.3ms <chr [2]> <Rprofmem> <bench_tm>
3 original       1000 19.0   19.3       1         3.16     2.43   747    14      2.85s <chr>     <Rprofmem> <bench_tm>
4 new            1000  1      1        19.2       1        1     9996     4      1.98s <chr>     <Rprofmem> <bench_tm>
5 original     100000  2.17   2.05      1         2.77     4.09   223     4      2.91s <chr>     <Rprofmem> <bench_tm>
6 new          100000  1      1         2.16      1        1      493     1      2.98s <chr>     <Rprofmem> <bench_tm>
7 original   10000000  1.94   1.97      1         2.52     1.32     3     8      3.42s <chr>     <Rprofmem> <bench_tm>
8 new        10000000  1      1         2.02      1        1        6     6      3.38s <chr>     <Rprofmem> <bench_tm>

[Image: ggplot2::autoplot() plot of the benchmark results above]
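
To rerun a benchmark like the one above, note that `create_dates()` is not defined anywhere in this thread. The block below is a minimal sketch of what such a helper could look like, assuming it just samples `n` random dates from a fixed range. It also includes `fin_year_sketch()`, a hypothetical illustration of one vectorised way to derive a financial year, assuming the year starts on 1 April and is formatted like "2022/23". Neither function is the code from this PR; both names and the date range are assumptions made for illustration.

```r
# Hypothetical stand-in for the create_dates() helper used in the benchmark;
# the real helper is not shown in this thread.
create_dates <- function(n,
                         start = as.Date("2000-01-01"),
                         end = as.Date("2023-12-31")) {
  # Sample n dates uniformly (with replacement) between start and end.
  days <- as.integer(end - start)
  start + sample.int(days + 1L, size = n, replace = TRUE) - 1L
}

# Illustrative only: one vectorised way to get a financial year that is
# assumed to start on 1 April, e.g. "2022/23". Not the code from this PR.
fin_year_sketch <- function(dates) {
  lt <- as.POSIXlt(dates)
  # lt$year counts years since 1900; lt$mon is 0-based (3 = April).
  start_year <- lt$year + 1900L - (lt$mon < 3L)
  out <- sprintf("%d/%02d", start_year, (start_year + 1L) %% 100L)
  # sprintf() turns NA into the string "NA", so restore real missing values.
  out[is.na(dates)] <- NA_character_
  out
}
```

With these defined, `fin_year_sketch(create_dates(5))` returns a character vector of five financial years, and either function can be dropped into a `bench::mark()` call in place of the ones shown above.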

Moohan requested a review from Tina815 on June 13, 2023 15:48
Moohan added the enhancement and Maintainance labels on Jun 14, 2023
Tina815 (Contributor) commented Jun 20, 2023

Hi James, thanks very much for the improvement! I really like the clever way you did this and how it made the method much more efficient.

Tina815 merged commit 4495b71 into development on Jun 20, 2023
Tina815 deleted the speedup/extract_fin_year branch on June 20, 2023 13:50