Merge pull request #102 from lyrasis/separate-file
Further `Split::IntoMultipleColumns` tweaks
kspurgin authored Aug 26, 2022
2 parents 7f59708 + be72f63 commit 44ec561
Showing 21 changed files with 307 additions and 665 deletions.
45 changes: 32 additions & 13 deletions CHANGELOG.adoc
@@ -20,41 +20,60 @@ and this project adheres to https://semver.org/spec/v2.0.0.html[Semantic Version

toc::[]

== Planned
=== Release after 2.9.0
* Remove `Clean::DelimiterOnlyFields`
* Remove `CombineValues::AcrossFieldGroup`
* Remove `Reshape::CollapseMultipleFieldsToOneTypedFieldPair`
* Remove `FilterRows::FieldValueGreaterThan`
* Remove `Helpers.delim_only?`
* Remove `Helpers.field_values`
* Remove `multival` and `sep` parameters from `Replace::FieldValueWithStaticMapping` transform
== Planned for a future release

== Unreleased
These changes are merged into the `main` branch but have not yet been tagged as a new version/release.

==== Breaking

==== Added

==== Changed

==== Bugfixes

==== Deleted

==== To be deprecated/Will break in a future version

== Releases

=== 3.0.0 - 2022-08-26

==== Breaking
* See the list of deleted transforms, helpers, and params below.
* `Split::IntoMultipleColumns` transform: no longer removes spaces between split segments that end up collapsed left or right. This was a bug, but fixing it could cause jobs relying on that behavior (or introducing subsequent transforms to deal with it) to fail or generate unexpected results.

==== Added
* `Warn::UnlessFieldValueMatches` transform
* `multimode` parameter for `Utils::FieldValueMatcher`
* Support for passing Procs in as file registry entry values (or as a value in a :dest_special_opts Hash).
* Support for passing Procs in as file registry entry values (or as a value in a :dest_special_opts Hash). See https://lyrasis.github.io/kiba-extend/file.file_registry_entry.html#file-registry-data-hashes-in-your-etl-application[NOTE under "File Registry Data hashes in your ETL application"].
* `delim` parameter for `Replace::FieldValueWithStaticMapping` transform

* Added test for the bug correction for `Split::IntoMultipleColumns`

==== Changed
* `Split::IntoMultipleColumns`: If empty string is passed in as the value to be split, all newly created fields will be nil

==== Bugfixes
* `Split::IntoMultipleColumns` no longer removes existing spaces between segments that get right/left collapsed
* Fixes incorrect value splitting in `Split::IntoMultipleColumns`
* `Reshape::FieldsToFieldGroupWithConstant` now works with single source fields (i.e. listed in `fieldmap` param) with nil values

==== Deleted
* Transforms
** `Clean::DelimiterOnlyFields`
** `CombineValues::AcrossFieldGroup`
** `Reshape::CollapseMultipleFieldsToOneTypedFieldPair`
** `FilterRows::FieldValueGreaterThan`
* Transform Helpers
** `Helpers.delim_only?`
** `Helpers.field_values`
* Parameters
** `multival` and `sep` parameters from `Replace::FieldValueWithStaticMapping` transform

==== To be deprecated/Will break in a future version
* `multival` and `sep` parameters for `Replace::FieldValueWithStaticMapping` transform

== Releases
=== 2.9.0 - 2022-07-28
https://github.com/lyrasis/kiba-extend/compare/v2.8.0\...v2.9.0[Compare code changes]
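
The `Split::IntoMultipleColumns` entries in the 3.0.0 section describe two behaviors: spaces between collapsed segments are preserved, and an empty string input yields nil for every new field. The following is a minimal plain-Ruby sketch of those two behaviors. The method name `split_into_columns` and its parameters (`sep`, `max_segments`, `collapse_on`) are hypothetical names chosen for illustration, and the collapse semantics are an assumption; this is not the transform's actual implementation.

```ruby
# Hypothetical helper for illustration only; not kiba-extend's code.
# Splits `value` on `sep` into exactly `max_segments` slots. When there are
# more segments than slots, the extras are re-joined with `sep` on the chosen
# side. Segments are never stripped, so existing spaces survive the collapse.
def split_into_columns(value, sep:, max_segments:, collapse_on: :right)
  # Empty (or nil) input: every newly created field is nil
  return Array.new(max_segments) if value.nil? || value.empty?

  segments = value.split(sep)
  if segments.length > max_segments
    extra = segments.length - max_segments
    segments =
      if collapse_on == :right
        # Keep the leading segments; join the remainder into the last slot
        segments.first(max_segments - 1) + [segments.drop(max_segments - 1).join(sep)]
      else
        # Join the leading overflow into the first slot; keep the rest
        [segments.first(extra + 1).join(sep)] + segments.drop(extra + 1)
      end
  end

  # Pad with nil when there are fewer segments than slots
  segments + Array.new(max_segments - segments.length)
end

right = split_into_columns("a: b: c: d", sep: ":", max_segments: 2)
left = split_into_columns("a: b: c: d", sep: ":", max_segments: 2, collapse_on: :left)
empty = split_into_columns("", sep: ":", max_segments: 2)
```

Under these assumptions, `right` keeps the spaces inside its collapsed last slot, `left` keeps them inside its collapsed first slot, and `empty` is an array of two nils.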

8 changes: 4 additions & 4 deletions Gemfile.lock
@@ -1,7 +1,7 @@
 PATH
   remote: .
   specs:
-    kiba-extend (2.9.0)
+    kiba-extend (3.0.0)
       activesupport (~> 6)
       amazing_print (~> 1.4)
       csv (~> 3)
@@ -28,13 +28,13 @@ GEM
     byebug (11.1.3)
     coderay (1.1.3)
     concurrent-ruby (1.1.10)
-    csv (3.2.3)
+    csv (3.2.5)
     diff-lcs (1.5.0)
     docile (1.4.0)
     dry-configurable (0.15.0)
       concurrent-ruby (~> 1.0)
       dry-core (~> 0.6)
-    dry-container (0.10.0)
+    dry-container (0.10.1)
       concurrent-ruby (~> 1.0)
     dry-core (0.8.1)
       concurrent-ruby (~> 1.0)
@@ -46,7 +46,7 @@ GEM
     measured (2.8.2)
       activesupport (>= 5.2)
     method_source (1.0.0)
-    minitest (5.16.2)
+    minitest (5.16.3)
     parallel (1.20.1)
     parser (3.0.2.0)
       ast (~> 2.4.1)
2 changes: 1 addition & 1 deletion doc/file_registry_entry.md
@@ -14,7 +14,7 @@ A file registry entry is initialized with a Hash of data about the file. This Ha

The allowable Hash keys, expected Hash value formats, and expectations about them are described below.

**NOTE:** For all keys besides `:dest_special_opts`, you may pass a Proc that returns the expected value format when called. For `:dest_special_opts`, you may pass Procs as individual values within the option Hash. This can be useful if you need to pass in a value that depends on other project config that may not be loaded/set up when registry is initially populated. A publicly available example is in `kiba-tms` which [sets destination initial headers](https://github.com/lyrasis/kiba-tms/blob/eb8f222f0dc753921e58d136cd15e5eab7472c60/lib/kiba/tms/table/prep/destination_options.rb#L32-L34) [based on the preferred name field for a given TMS client project, and whether they want to include "flipped" form as variant terms](https://github.com/lyrasis/kiba-tms/blob/eb8f222f0dc753921e58d136cd15e5eab7472c60/lib/kiba/tms/constituents.rb#L140-L148).
**NOTE:** (Since 3.0.0) For all keys besides `:dest_special_opts`, you may pass a Proc that returns the expected value format when called. For `:dest_special_opts`, you may pass Procs as individual values within the option Hash. This can be useful if you need to pass in a value that depends on other project config that may not be loaded/set up when registry is initially populated. A publicly available example is in `kiba-tms` which [sets destination initial headers](https://github.com/lyrasis/kiba-tms/blob/eb8f222f0dc753921e58d136cd15e5eab7472c60/lib/kiba/tms/table/prep/destination_options.rb#L32-L34) [based on the preferred name field for a given TMS client project, and whether they want to include "flipped" form as variant terms](https://github.com/lyrasis/kiba-tms/blob/eb8f222f0dc753921e58d136cd15e5eab7472c60/lib/kiba/tms/constituents.rb#L140-L148).
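
The deferred evaluation described in this NOTE can be sketched in plain Ruby. This is an illustration under stated assumptions, not the registry's actual code: `resolve_entry`, the `config` Hash, and the `:initial_headers` option name are hypothetical stand-ins for whatever the registry and your project config really use.

```ruby
# Hypothetical helper for illustration only; not part of kiba-extend.
# Resolves Proc values when the entry is read rather than when it is defined.
def resolve_entry(entry)
  entry.to_h do |key, value|
    resolved =
      if key == :dest_special_opts && value.is_a?(Hash)
        # For :dest_special_opts, Procs appear as individual option values
        value.transform_values { |opt| opt.is_a?(Proc) ? opt.call : opt }
      elsif value.is_a?(Proc)
        # For every other key, the Proc returns the whole expected value
        value.call
      else
        value
      end
    [key, resolved]
  end
end

# Stand-in for project config that is not yet set up at registration time
config = {}

entry = {
  path: -> { File.join(config.fetch(:datadir), "objects.csv") },
  dest_special_opts: {initial_headers: -> { config.fetch(:preferred_fields) }}
}

# Config is populated later, after the entry has been defined
config[:datadir] = "/tmp/project"
config[:preferred_fields] = %i[objectnumber title]

resolved = resolve_entry(entry)
```

Because the Procs are only called inside `resolve_entry`, defining `entry` before `config` is filled in does not raise a `KeyError`.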

### `:path`
[String] full or expandable relative path to the expected location of the file
33 changes: 0 additions & 33 deletions lib/kiba/extend/transforms/clean/delimiter_only_fields.rb

This file was deleted.

31 changes: 0 additions & 31 deletions lib/kiba/extend/transforms/combine_values/across_field_group.rb

This file was deleted.

22 changes: 0 additions & 22 deletions lib/kiba/extend/transforms/filter_rows.rb
@@ -6,28 +6,6 @@ module Transforms
# Transformations that remove rows based on different types of conditions
module FilterRows
::FilterRows = Kiba::Extend::Transforms::FilterRows

# @deprecated Convert any uses of this transform in your jobs to
# {Kiba::Extend::Transforms::FilterRows::WithLambda}
class FieldValueGreaterThan
def initialize(action:, field:, value:)
warn("#{self.class.name} will be removed in a future version. Convert to `FilterRows::WithLambda`", category: :deprecated)
@action = action
@field = field
@value = value
end

# @param row [Hash{ Symbol => String }]
def process(row)
val = row.fetch(@field)
case @action
when :keep
val > @value ? row : nil
when :reject
val > @value ? nil : row
end
end
end
end
end
end
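
The removed `FieldValueGreaterThan` class pointed users to `FilterRows::WithLambda`. The sketch below shows how the same greater-than comparison can be expressed as a lambda. `LambdaFilter` is a hypothetical stand-in written for this example so that it runs without the gem; its `action:` and `lambda:` parameters are assumptions modeled on the removed class, and the real `WithLambda` signature should be checked in the kiba-extend docs.

```ruby
# Hypothetical stand-in for a lambda-driven row filter; not kiba-extend's code.
class LambdaFilter
  def initialize(action:, lambda:)
    @action = action
    @lambda = lambda
  end

  # Returns the row to keep it, or nil to drop it from the pipeline
  def process(row)
    matched = @lambda.call(row)
    case @action
    when :keep then matched ? row : nil
    when :reject then matched ? nil : row
    end
  end
end

# Equivalent of the removed transform with action: :keep, field: :count, value: 10
greater_than_ten = ->(row) { row.fetch(:count) > 10 }
filter = LambdaFilter.new(action: :keep, lambda: greater_than_ten)

rows = [{count: 5}, {count: 25}]
kept = rows.map { |row| filter.process(row) }.compact
```

The comparison that used to be split across `field:` and `value:` parameters lives entirely in the lambda, so other conditions need no new transform class.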
38 changes: 0 additions & 38 deletions lib/kiba/extend/transforms/helpers.rb
@@ -6,28 +6,6 @@ module Transforms
# utility functions across Transforms
module Helpers
module_function
# @deprecated in 2.9.0. Use {DelimOnlyChecker} instead. This service
# class has slightly different behavior in that it returns true for `nil` or `empty?` values by default.
# To replicate this exact behavior in {DelimOnlyChecker}, initialize it with `blank_result: false`.
#
# Indicates whether a field value is delimiter-only. If `usenull` is set to true, the
# config.nullvalue string is treated as empty in detecting delimiter-only-ness
# @param val [String] The field value to check
# @param delim [String] The multivalue delimiter
# @param usenull [Boolean, String] If `true`, replaces config.nullvalue string with '' to make
# determination. If `false` no replacement is done. If a String is given, that string is replaced
# with '' to make determination. If Array of Strings given, each String is replaced with ''.
# @return [false] if `value` is nil, empty, or contains characters other than delimiter(s)
# and leading/trailing spaces
# @return [true] if `value` contains only delimiter(s) and leading/trailing spaces
def delim_only?(val, delim, usenull = false)
dep = '`Kiba::Extend::Transforms::Helpers.delim_only?` is deprecated and will be removed in a future release.'
alt = 'Use `Kiba::Extend::Transforms::Helpers::DelimOnlyChecker` service class instead'
used = caller.first
warn("#{Kiba::Extend.warning_label}: #{dep} #{alt}\n Used at: #{used}")
nv = usenull ? Kiba::Extend.nullvalue : nil
DelimOnlyChecker.call(delim: delim, treat_as_null: nv, value: val, blank_result: false)
end

# Indicates whether a given value is empty, ignoring delimiters. If `usenull` is true,
# the config.nullvalue string is treated as empty
@@ -41,22 +19,6 @@ def empty?(val, usenull = false)
chkval.strip.empty?
end

# @deprecated in 2.9.0. Use {FieldValueGetter} instead.
# @param row [Hash{Symbol=>String,Nil}l] A row of data
# @param fields [Array(Symbol)] Names of fields to process
# @param discard [Array<:nil, :empty, :delim>] Types of field values to remove from returned hash
# @param delim [String] Multivalue delimiter used to split fields
# @param usenull [Boolean] If true, replaces '%NULLVALUE%' with '' to make determination
# @return [Hash{Symbol=>String,Nil}l] of field data for fields that meet keep criteria
def field_values(row:, fields:, discard: %i[nil empty delim], delim: Kiba::Extend.delim, usenull: false)
dep = '`Kiba::Extend::Transforms::Helpers.field_values` is deprecated and will be removed in a future release.'
alt = 'Use `Kiba::Extend::Transforms::Helpers::FieldValueGetter` service class instead'
used = caller.first
warn("#{Kiba::Extend.warning_label}: #{dep} #{alt}\n Used at: #{used}")
nv = usenull ? Kiba::Extend.nullvalue : nil
FieldValueGetter.new(fields: fields, discard: discard, delim: delim, treat_as_null: nv).call(row)
end

# @param field_vals [Hash{Symbol=>String,Nil}] A subset of a row
# @param discard [:nil, :empty, :delim] Types of field values to remove from returned hash
# @param delim [String] Multivalue delimiter used to split fields
@@ -177,19 +177,31 @@ module Replace
# ```
class FieldValueWithStaticMapping
class << self
def delim_and_sep_warning
warning = "Both `delim` and deprecated `sep` parameters given. Using `delim` value.\nTO FIX: remove `sep` parameter"
"#{Kiba::Extend.warning_label}: #{self.class.name}: #{warning}"
end

def multival_warning
warning = "`multival` parameter is deprecated. In the future, if a `delim` is given, the transform will operate in multival mode\nTO FIX: remove `multival` parameter"
"#{Kiba::Extend.warning_label}: #{self.class.name}: #{warning}"
def multival_msg
<<~MSG
#{self.name} no longer supports the `multival` parameter.
If a `delim` value is given, the transform will operate in multival mode
TO FIX: remove `multival` parameter, ensuring a `delim` value is given
MSG
end

def sep_warning
warning = "`sep` parameters is deprecated.\nTO FIX: change `sep` to `delim`"
"#{Kiba::Extend.warning_label}: #{self.class.name}: #{warning}"
# Overridden to provide more informative/detailed ArgumentError messages for parameters that are
# removed after not having been deprecated very long.
def new(source:, target: nil, mapping:, fallback_val: :orig, delete_source: true, delim: nil,
multival: nil, sep: nil)
instance = allocate
fail(ArgumentError, sep_msg) if sep
fail(ArgumentError, multival_msg) if multival
instance.send(:initialize, **{source: source, target: target, mapping: mapping, fallback_val: fallback_val,
delete_source: delete_source, delim: delim})
instance
end

def sep_msg
<<~MSG
#{self.name} no longer supports the `sep` parameter
TO FIX: change `sep` to `delim`
MSG
end
end
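
The overridden `new` above intercepts removed keyword arguments before `initialize` runs, so callers see a targeted message instead of Ruby's generic `unknown keyword` error. The pattern can be sketched in isolation as follows. `StaticMapper` and its reduced parameter list are hypothetical and exist only for this example.

```ruby
# Hypothetical class showing the pattern only; not kiba-extend's code.
class StaticMapper
  class << self
    # Accepts the removed `sep` keyword solely to explain how to migrate
    def new(source:, delim: nil, sep: nil)
      if sep
        raise ArgumentError,
          "#{name} no longer supports the `sep` parameter. TO FIX: change `sep` to `delim`"
      end

      # Build the instance by hand, passing only the supported keywords
      instance = allocate
      instance.send(:initialize, source: source, delim: delim)
      instance
    end
  end

  attr_reader :source, :delim

  def initialize(source:, delim: nil)
    @source = source
    @delim = delim
  end
end

mapper = StaticMapper.new(source: :status, delim: "|")
```

Keeping the removed keyword out of `initialize` leaves the documented signature clean, while the class-level `new` carries the migration hint for as long as it is useful.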

@@ -201,18 +213,14 @@ def sep_warning
# @param delete_source [Boolean] whether to remove source field after mapping. Has no effect if
# a different target field is not given
# @param delim [nil, String] if a value is given, turns on "multival" mode, splitting the whole field
# value on the string given
# @param multival [nil, Boolean] DEPRECATED - DO NOT USE
# @param sep [nil, String] DEPRECATED - DO NOT USE
def initialize(source:, target: nil, mapping:, fallback_val: :orig, delete_source: true, delim: nil,
multival: nil, sep: nil)
# value on the string given (since 3.0.0)
def initialize(source:, target: nil, mapping:, fallback_val: :orig, delete_source: true, delim: nil)
@source = source
@target = target ? target : source
@mapping = mapping
@fallback = fallback_val
@del = delete_source
@delim = set_delim(sep, delim)
warn(self.class.multival_warning) unless multival.nil?
@delim = delim
@multival = true if @delim
end
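
With `multival` removed, the presence of `delim` alone switches the transform into multival mode. A minimal sketch of that rule follows, assuming a simplified `fallback_val` in which `:orig` keeps an unmapped value and anything else replaces it. `map_value` is a hypothetical helper, not the transform's `process` method.

```ruby
# Hypothetical helper for illustration only; not kiba-extend's code.
def map_value(value, mapping:, delim: nil, fallback_val: :orig)
  return value if value.nil?

  # Map one value, falling back to the original or to a fixed replacement
  map_one = lambda do |val|
    mapping.fetch(val) { fallback_val == :orig ? val : fallback_val }
  end

  # No delim: treat the whole field value as a single value
  return map_one.call(value) unless delim

  # delim given: multival mode, so split, map each part, and re-join
  value.split(delim).map { |val| map_one.call(val) }.join(delim)
end

mapping = {"y" => "yes", "n" => "no"}
multi = map_value("y|n|x", mapping: mapping, delim: "|")
single = map_value("y|n|x", mapping: mapping)
```

In this sketch `multi` has each part mapped separately, while `single` is left unchanged because the whole string `"y|n|x"` is not a key in the mapping.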


This file was deleted.
