Releases: spotify/scio
v0.5.0-alpha2
Breaking changes
- BigQueryIO in JobTest#output now requires a type parameter. Explicit .map(T.toTableRow) of test data is no longer needed.
- Typed AvroIO now accepts case classes instead of Avro records in JobTest. Explicit .map(T.toGenericRecord) of test data is no longer needed. See this change for more.
- Package com.spotify.scio.extra.transforms is moved from scio-extra to scio-core, under com.spotify.scio.transforms.
See this section for more details.
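For the package move, the migration in user code is likely to be a one-line import change. The fragment below is a minimal sketch, assuming the code previously used a wildcard import from the old package and that scio-core 0.5.0 is on the classpath; the wildcard form is an illustration, not taken from the release notes.

```scala
// Before v0.5.0 the transforms lived in the scio-extra artifact:
// import com.spotify.scio.extra.transforms._

// From v0.5.0 the same package lives in scio-core under a shorter name.
// Assumption: a wildcard import is used here only for illustration.
import com.spotify.scio.transforms._
```

If scio-extra was a dependency only for these transforms, it may no longer be needed after the change.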
Features
- Remove toGenericRecord requirement when testing typed AvroIO #1022 #1036
- Bump sparkey to 2.2.1, protobuf-generic to 0.2.4 #1028
Bug fixes
v0.5.0-alpha1
Breaking changes
- BigQueryIO in JobTest#output now requires a type parameter. Explicit .map(T.toTableRow) of test data is no longer needed.
- Package com.spotify.scio.extra.transforms is moved from scio-extra to scio-core, under com.spotify.scio.transforms.
See this section for more details.
Features
- Support reading BigQuery as Avro #964, #992
- Add TFRecordSpec support for Featran #1002
- Add AsyncLookupDoFn #1012
Bug fixes
v0.4.7
"Hydrochoerus hydrochaeris"
Features
- Add support for TFRecordSpec #990
- Add convenience methods to randomSplit #987
- Add BigtableDoFn #922 #931
- Add optional arguments validation #979
- Performance improvement in Avro and BigQuery macro converters #989
- Update new bigquery datetime and timestamp specification #982
Bug fixes
v0.4.6
"Galago gallarum"
Features
- Upgrade Beam to 2.2.0 #797 #958
- Support dynamic file IO destinations #919 #965
- Support custom Kryo options via PipelineOptions #896 #955
- Propagate input to TensorFlow predict output fn
- Annotate more examples with Socco
- Support compression in TextIO #972
- Use Compression in TFRecordIO #977
- Add TSV examples #974
Bug fixes
- Use window as side input cache key #959 #960
- Use canonical path in macro type providers #975
- Fix deduplication in SCollection#subtract #973
- Fix empty RHS for hashJoin and hashLeftJoin #953
- Fix ClassNotFound issue with ClosureCleaner
- Lift projection in ParquetAvroFile#flatMap
- Add Dataflow runner to scio-examples #963 #968
- Remove deprecated Pubsub ClientAuthInterceptor #957 #962
v0.4.5
"Felis ferus"
Features
- Add PubSub admin helper methods #929
- Add Elasticsearch ensureIndex helper methods #912 #925
- Support drop row range in Bigtable admin #926
- Examples site with Socco annotations
Bug fixes
- Fix dependency issue with DataflowRunner #934 #935
- Use SingletonSideInput instead of MultiMapSideInput for hash joins #916 #917
- Use thread safe version of SparkeyReader #941 #944
- Make SCollection#take lazy, fix #938 #940
- Require extra map in Parquet Avro, fix #928 #930
- Tag and exclude slow tests #909 #927
- Fix build for Windows #918 #920 #921
v0.4.4
"Erinaceus europaeus"
Breaking change
Dataflow runner dependency is removed from scio-core. You need to explicitly add all runner dependencies now. Dataflow specific logic is also removed from ScioResult. See this page for more details.
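As a rough illustration of adding a runner dependency explicitly, a build.sbt fragment might look like the sketch below. The Beam version shown (2.1.0) is an assumption based on the version named in the v0.4.2 notes; check it against the Beam version your Scio release actually uses.

```scala
// build.sbt (sketch, not a complete build definition)

// Assumption: Beam 2.1.0; adjust to match your Scio release.
val beamVersion = "2.1.0"

libraryDependencies ++= Seq(
  "com.spotify" %% "scio-core" % "0.4.4",
  // Dataflow runner, no longer pulled in by scio-core and so listed explicitly.
  "org.apache.beam" % "beam-runners-google-cloud-dataflow-java" % beamVersion
)
```

Other runners would be added the same way, one artifact per runner.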
Features
- Add bigquery dynamic destinations #876 #73
- Add time condition matchers #883 #458
- Support default value for singleton side input #894
- Allow wrapping internal views as SideInput #897
- Decouple Dataflow runner #779 #882 #850
- Bump Scala 2.12 to 2.12.4 #893
- Bump sbt to 1.0.3 #904
Bug fixes
v0.4.3
"Dendrohyrax dorsalis"
Features
Bug fixes
v0.4.2
"Castor canadensis"
Breaking change
Beam direct runner is no longer a dependency of scio-core. Add the following dependency if you want to run a pipeline locally. The current Beam version is 2.1.0.
"org.apache.beam" % "beam-runners-direct-java" % beamVersion
Features
- Add flatten and flattenValues to SCollection #842
- Add Annoy side input #783 #812
- Support saving TF Example together with feature spec #816
- Support metrics in JobTest #846 #851
- Use Scala 2.12 for scio-repl #834 #835
- Support custom body in BigQueryType #808
- Add Scio Benchmark #506 #830
- Remove direct runner dependency #777 #852
- Remove DataflowPipelineOptions from ScioILoop #779
- Bump Bigtable dependency #841
- Update Algebird to 0.13.2
Bug fixes
v0.4.1
"Blarina brevicauda"
Features
- Bump Beam to 2.1.0 #633
- Add Parquet Avro read support #794 #801
- Add custom parallelism for Cassandra sink #792 #795
- Make runWithLocalOutput public to allow more generic testing #826
- Add option getters to BigQuery TableRow #810
- Add tableExists to BigQueryClient #824
- Add enhanced version of TableReference with asTableSpec #822
- Make ArraySemigroup a Monoid #800
- Use Kryo 4.0.1 #818
Bug fixes
v0.4.0
"Atelerix albiventris"
Features
- Add Parquet Avro read support #771
- Add option to set BigQuery priority #759
- Add missing helpers for Date, Time, DateTime #761
- Add example of DoFn usage #766
- Add safeFlatMap #776
- Support custom number of shards in TFRecord output #787
- Support fetching Bigtable cluster size #773
- Support custom Circe Printer for JSON IO #769, #770
- Fail on duplicate IO within the pipeline in JobTest #786
- Coder performance improvements #767
- Move TFRecordIO to tensorflow sub-module
Bug fixes
- Enforce numeric for BigQuery $LATEST partition #780 #781
- Lazy batch Elasticsearch requests #784 #785
- Check isCacheEnabled in BigQueryClient #758
- Support shouldNot with SCollectionMatchers, fix #760 #763
- Fix AvroType schema namespace #774
- Monoid reduceOption should default to zero
- Fix null string in TopWikipediaSessions