
Releases: ropensci/targets

Memory efficiency

18 Nov 02:59
d6a696a

targets 1.9.0

Improvements

  • Un-break workflows that use format = "file_fast" (#1339, @koefoeden).
  • Fix deadlock in error = "trim" (#1340, @koefoeden).
  • Remove tailored debugging message (#1341, @koefoeden).
  • Store warnings while writing to storage (#1345, @Aariq).
  • Allow garbage_collection to be a non-negative integer to control the frequency of garbage collection in a performant, convenient, unified way (#1351).
  • Deprecate the garbage_collection argument of tar_make(), tar_make_future(), and tar_make_clustermq() (#1351).
  • Instrument target_run(), target_prepare(), and target_conclude() using autometric.
  • Avoid sending problematic error classes such as "vctrs_error_subscript_oob" to rlang::abort() (#1354, @Jiefei-Wang).
  • Reduce memory consumption by ~23% in large pipelines by avoiding the accumulation of promise objects (#1352).
  • Avoid store_assert_format() and store_convert_object() if storage is "none".
  • Add a list() method to tar_repository_cas() to make it easier and more efficient to specify custom CAS repositories (#1366).
  • Improve speed and reduce memory consumption by avoiding deep copies of inner environments of target definition objects (#1368).
  • Reduce memory consumption by storing buds and branches as lightweight references when memory is "transient" (#1364).
  • Replace the memory class with the new lookup class.
  • Implement memory = "auto" to select transient memory for dynamic branches and persistent memory for other targets (#1371). See the sketch after this list.
  • Omit whole pattern targets from branch subpipelines when possible. This should reduce memory consumption in some cases.
  • Omit whole stem targets from branch subpipelines when retrieval is "main" and only a bud is actually used. The same cannot be done with branches because each branch may need to be (un)marshaled individually.
  • Compress branches into references when retrieval is "worker" and the whole pattern is part of the subpipeline.
  • Avoid duplicated branch aggregation: just send the branches over the network.
  • Back-compatibly switch format = "qs" from qs to qs2 (#1373).
  • Add tar_unblock_process().
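
A minimal sketch of the new memory and garbage collection settings in _targets.R (target names and commands are hypothetical; the reading of the integer as "run gc() once every N targets" follows the description above, see ?tar_option_set):

    library(targets)
    # Let targets choose transient memory for dynamic branches and persistent
    # memory for other targets, and run gc() once every 5 targets (assumed
    # semantics of the integer).
    tar_option_set(memory = "auto", garbage_collection = 5)
    list(
      tar_target(index, seq_len(100)),
      tar_target(squared, index^2, pattern = map(index))
    )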

Potentially invalidating changes

  • Add "keepNA" and "keepInteger" to .deparseOpts() (#1375). This may cause existing pipelines to rerun, but it makes add-ons like tarchetypes::tar_map() much easier to use.

Content addressable storage

02 Oct 17:41
3695f06

targets 1.8.0

  • Wrap tar_watch() UI module in bslib::page() (#1302, @kwbyron-lilly).
  • Remove callr_function from the tar_make_as_job() argument list.
  • Ensure storage = "worker" is respected when the process of storing an object generates an error (#1304, @multimeric).
  • Default to the _targets.R pattern in tar_branches() (#1306, @multimeric, @mattwarkentin).
  • Remove superfluous functions and globals from metadata with tar_prune() (#1312, @benzipperer).
  • Change the default workspace_on_error option to TRUE (#1310, @hadley).
  • Enhance and organize the error = "stop" error message.
  • Avoid saving a file in _targets/objects for error = "null". Instead, switch to a special "null" storage format class when error is "null" and the target throws an error. This should allow users to more freely create new formats with tar_format() without worrying about how to handle NULL objects created by error = "null".
  • Implement format = "auto" (#1311, @hadley).
  • Replace pingr dependency with base::socketConnection() for local URL utilities (#1317, #1318, @Adafede).
  • Implement tar_repository_cas(), tar_repository_cas_local(), and tar_repository_cas_local_gc() for content-addressable storage (#1232, #1314, @noamross). See the sketch after this list.
  • Add tar_format_get() to make implementing CAS systems easier.
  • Implement error = "trim" in tar_target() and tar_option_set() (#1310, #1311, @hadley).
  • Use the file system type to decide whether to trust time stamps (#1315, @hadley, @gaborcsardi).
  • Deprecate format = "file_fast" in favor of the above (#1315).
  • Deprecate trust_object_timestamps in favor of the more unified trust_timestamps in tar_option_set() (#1315).
  • Print storage size of each target in verbose reporters (#1337, @psychelzh).
  • Combine help files of tar_target() and tar_target_raw(). Same with tar_load() and tar_load_raw().
  • Add a substitute argument to tar_format() to make it easier to write custom storage formats without metaprogramming.
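
A hedged sketch combining several of the additions above in _targets.R (the "cas" directory, target names, and commands are placeholders):

    library(targets)
    # Store outputs in a local content-addressable store instead of
    # _targets/objects/.
    tar_option_set(repository = tar_repository_cas_local("cas"))
    list(
      tar_target(raw, data.frame(x = runif(10)), format = "auto"),  # new automatic format
      tar_target(model, lm(x ~ 1, data = raw), error = "trim")      # new error mode
    )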

bslib and speed

20 Jun 18:23
42cb4c1

targets 1.7.1

  • Use bslib in tar_watch().
  • Speed up target_upstream_edges() and pipeline_upstream_edges() by avoiding data frames until the last minute (17% speedup for certain kinds of large pipelines).
  • Automatically set as_job to FALSE in tar_make() if rstudioapi and/or RStudio is not available.

secretbase

17 Apr 17:14
669be66

targets 1.7.0

Invalidating changes

  • Use secretbase::siphash13() instead of digest(algo = "xxhash64", serializationVersion = 3) so hashes of in-memory objects no longer depend on serialization version 3 headers (#1244, @shikokuchuo). Unfortunately, pipelines built with earlier versions of targets will need to rerun.

Other improvements

  • Ensure patterns marshal properly (#1266, #1264, njtierney/geotargets#52, @Aariq, @njtierney).
  • Inform and prompt the user when the pipeline was built with an old version of targets and changes to the package will cause the current work to rerun (#1244). For the tar_make*() functions, utils::menu() prompts the user, giving them a chance to downgrade targets if necessary.
  • For type safety in the internal database class, read all columns as character vectors in data.table::fread(), then convert them to the correct types afterwards.
  • Add a new tar_resources_custom_format() function which can pass environment variables to customize the behavior of custom tar_format() storage formats (#1263, #1232, @Aariq, @noamross). See the sketch after this list.
  • Only marshal dependencies if actually sending the target to a parallel worker.
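
A hedged sketch of how tar_resources_custom_format() might be paired with a custom tar_format() storage format (the envvars argument name and the example format are assumptions; see the help file for the exact interface):

    library(targets)
    # A simple RDS-based custom format whose write step reads an environment
    # variable supplied through the new resources function.
    format_rds <- tar_format(
      read = function(path) readRDS(path),
      write = function(object, path) {
        saveRDS(object, path, compress = Sys.getenv("MY_COMPRESS", "TRUE") == "TRUE")
      }
    )
    tar_target(
      dataset,
      data.frame(x = 1:10),
      format = format_rds,
      resources = tar_resources(
        custom_format = tar_resources_custom_format(envvars = c(MY_COMPRESS = "FALSE"))
      )
    )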

Custom descriptions

13 Mar 15:54
83568e1

targets 1.6.0

  • Modernize extras in tar_renv().
  • tar_target() gains a description argument for free-form text describing what the target is about (#1230, #1235, #1236, @tjmahr).
  • tar_visnetwork(), tar_glimpse(), tar_network(), tar_mermaid(), and tar_manifest() now optionally show target descriptions (#1230, #1235, #1236, @tjmahr).
  • tar_described_as() is a new wrapper around tidyselect::any_of() to select specific subsets of targets based on the description rather than the name (#1136, #1196, @noamross, @mattmoo). See the sketch after this list.
  • Fix the documentation of the names argument (nudge users toward tidyselect expressions).
  • Make assertions on the pipeline process more robust (to check if two processes are trying to access the same data store).
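
A small sketch of the description workflow (target names, commands, and descriptions are hypothetical):

    library(targets)
    # _targets.R: attach free-form descriptions to targets.
    list(
      tar_target(fit_wt, lm(mpg ~ wt, data = mtcars), description = "model: weight"),
      tar_target(fit_hp, lm(mpg ~ hp, data = mtcars), description = "model: horsepower")
    )
    # Interactively: select targets by description instead of by name.
    tar_manifest(names = tar_described_as(starts_with("model")))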

CRAN patch

15 Feb 11:00
e5c79bd

targets 1.5.1

  • Avoid arrow-related CRAN check NOTE.
  • use_targets() only writes the _targets.R script. The run.sh and run.R scripts are superseded by the as_job argument of tar_make(). Users not using the RStudio IDE can call tar_make() with callr_function = callr::r_bg to run the pipeline as a background process. tar_make_clustermq() and tar_make_future() are superseded in favor of tar_make(use_crew = TRUE), so template files are no longer written for them automatically. See the sketch below.
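
For example (the second call assumes a crew controller has been registered with tar_option_set(controller = ...) in _targets.R):

    library(targets)
    # Run the pipeline as a background process without the RStudio IDE.
    tar_make(callr_function = callr::r_bg)
    # Supersedes tar_make_clustermq() and tar_make_future().
    tar_make(use_crew = TRUE)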

Small fixes

08 Jan 16:35

targets 1.4.1

  • Print "errored pipeline" when at least one target errors.
  • Bump minimum clustermq version to 0.9.2.
  • Repair the tar_debug_instructions() tips for when commands are long.
  • Do not look for dependencies of primitive functions (#1200, @smwindecker, @joelnitta).

AWS/crew efficiency, random number safety

11 Dec 16:25

targets 1.4.0

Invalidating changes

Because of the changes below, upgrading to this version of targets will unavoidably invalidate previously built targets in existing pipelines. Your pipeline code should still work, but any targets you ran before will most likely need to rerun after the upgrade.

  • Use SHA512 during the creation of target-specific pseudo-random number generator seeds (#1139). This change decreases the risk of overlapping/correlated random number generator streams. See the "RNG overlap" section of the tar_seed_create() help file for details and justification. Unfortunately, this change will invalidate all currently built targets because the seeds will be different. To avoid rerunning your whole pipeline, set cue = tar_cue(seed = FALSE) in tar_target() (see the sketch after this list).
  • For cloud storage: instead of the hash of the local file, use the ETag for AWS S3 targets and the MD5 hash for GCP GCS targets (#1172). Sanitize with targets:::digest_chr64() in both cases before storing the result in the metadata.
  • For a cloud target to be truly up to date, the hash in the metadata now needs to match the current object in the bucket, not the version recorded in the metadata (#1172). In other words, targets now tries to ensure that the up-to-date data objects in the cloud are in their newest versions. So if you roll back the metadata to an older version, you will still be able to access historical data versions with e.g. tar_read(), but the pipeline will no longer be up to date.
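
As noted above, a target can opt out of seed-based invalidation so it does not rerun just because its seed changed (target name and command are hypothetical):

    library(targets)
    tar_target(
      simulation,
      rnorm(1000),
      cue = tar_cue(seed = FALSE)  # ignore the seed when checking if up to date
    )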

Other changes to seeds

  • Add a new exported function tar_seed_create() which creates target-specific pseudo-random number generator seeds.
  • Add an "RNG overlap" section in the tar_seed_create() help file to justify and defend how targets and tarchetypes approach pseudo-random numbers.
  • Add function tar_seed_set(), which sets a seed and resets all RNG algorithms to their defaults in the user's R installation. Each target now uses tar_seed_set() to set its seed before running its R command (#1139). See the sketch after this list.
  • Deprecate tar_seed() in favor of the new tar_seed_get() function.
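
A short sketch of the new seed utilities (the target name "simulation" is hypothetical):

    library(targets)
    seed <- tar_seed_create("simulation")  # deterministic, target-specific seed
    tar_seed_set(seed)                     # set the seed with default RNG algorithms
    rnorm(3)                               # draws from the same stream the target would use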

Other cloud storage improvements

  • For all cloud targets, check hashes in batched LIST requests instead of individual HEAD requests (#1172). Dramatically speeds up the process of checking if cloud targets are up to date.
  • For AWS S3 targets, tar_delete(), tar_destroy(), and tar_prune() now use efficient batched calls to delete_objects() instead of costly individual calls to delete_object() (#1171).
  • Add a new verbose argument to tar_delete(), tar_destroy(), and tar_prune().
  • Add a new batch_size argument to tar_delete(), tar_destroy(), and tar_prune().
  • Add new arguments page_size and verbose to tar_resources_aws() (#1172).
  • Add a new tar_unversion() function to remove version IDs from the metadata of cloud targets. This makes it easier to interact with just the current version of each target, as opposed to the version ID recorded in the local metadata. See the sketch below.
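
A hedged usage sketch of the new cloud cleanup arguments (argument defaults may differ; see the help files):

    library(targets)
    # Delete cloud objects in batched requests and suppress per-batch messages.
    tar_delete(everything(), batch_size = 1000, verbose = FALSE)
    # Drop version IDs from the metadata so the pipeline tracks only the
    # current version of each cloud object.
    tar_unversion()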

Other improvements

  • Migrate to the changes in clustermq 0.9.0 (@mschubert).
  • In progress statuses, change "started" to "dispatched" and change "built" to "completed" (#1192).
  • Deprecate tar_started() in favor of tar_dispatched() (#1192).
  • Deprecate tar_built() in favor of tar_completed() (#1192). See the sketch after this list.
  • Console messages from reporters say "dispatched" and "completed" instead of "started" and "built" (#1192).
  • The crew scheduling algorithm no longer waits on saturated controllers, and targets that are ready are greedily dispatched to crew even if all workers are busy (#1182, #1192). To appropriately set expectations for users, reporters print "dispatched (pending)" instead of "dispatched" if the task load is backlogged at the moment.
  • In the crew scheduling algorithm, waiting for tasks is now a truly event-driven process and consumes 5-10x less CPU resources (#1183). Only the auto-scaling of workers uses polling (with an inexpensive default polling interval of 0.5 seconds, configurable through seconds_interval in the controller).
  • Simplify stored target tracebacks.
  • Print the traceback on error.
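
For reference, the renamed progress helpers work like their predecessors:

    library(targets)
    tar_dispatched()  # replaces tar_started(): names of dispatched targets
    tar_completed()   # replaces tar_built(): names of completed targets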

CRAN patch

11 Oct 19:53

targets 1.3.2

  • Try to fix function help files for CRAN.

Cloud metadata fixes

11 Oct 19:12

targets 1.3.1

  • Add tar_config_projects() and tar_config_yaml() (#1153, @psychelzh).
  • Apply error modes to builder_wait_correct_hash() in target_conclude.tar_builder() (#1154, @gadenbuie).
  • Remove duplicated error message from builder_error_null().
  • Allow tar_meta_upload() and tar_meta_download() to avoid errors if one or more metadata files do not exist. Add a new argument strict to control error behavior.
  • Add new arguments meta, progress, process, and crew to control individual metadata files in tar_meta_upload(), tar_meta_download(), tar_meta_sync(), and tar_meta_delete(). See the sketch after this list.
  • Avoid newly deprecated arguments and functions in crew 0.5.0.9003 (https://github.com/wlandau/crew/issues/131).
  • Allow tar_read() etc. inside a pipeline whenever it uses a different data store (#1158, @MilesMcBain).
  • Set seed = FALSE in future::future() (#1166, @svraka).
  • Add a new physics argument to tar_visnetwork() and tar_glimpse() (#925, @Bdblodgett-usgs).
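
A hedged sketch of the new configuration and metadata helpers (only the arguments named in these notes are shown):

    library(targets)
    tar_config_projects()  # names of the projects defined in _targets.yaml
    tar_config_yaml()      # contents of _targets.yaml as a nested list
    # Download only the main metadata file and tolerate files missing from the bucket.
    tar_meta_download(meta = TRUE, progress = FALSE, process = FALSE, crew = FALSE, strict = FALSE)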