Releases: spotify/scio
v0.7.3
"Vulpes Vulpes"
Bug Fixes & Improvements
- Fix FileStorage.avroFile (#1727)
- Fix perf regression in Coder (#1729)
- Reduce the size of the captured stacktrace in WrappedBCoder (#1745)
- Fix #1734: Limit job graph size by not wrapping native beam coders (#1741)
- Explicit reset position on SeekableInput (#1747)
- Support scalatest NotWord (#1743)
- Make BigQuery priority sysprop case-insensitive (#1736)
- Use getSchema and avoid reflection when creating AvroCoder (#1724)
- Clarify error message when a job uses an input multiple times. (#1720)
- tiny typos in Coders.md (#1732)
- Incorrect generic type in ScalaDoc (#1725)
- Use BenchmarkResult as entity (#1712)
v0.7.2
"Ursus t. Ussuricus"
Features
- Update Beam to 2.10 (#1674, #1676)
- Clearer Coder exceptions (#1672)
- Use new HadoopFormatIO (#1675)
- Add spanner MutationGroup coder (#1704)
- Optimize CombineFns (speeds up aggregate-, reduce-, and combine-based operations!) (#1699)
- Use list side input on cross product (#1691)
- Fix DistinctBy serialization for Scala Classes (#1710, #1715)
- Remove deprecation warning on tfRecordExampleFileWithSchema (#1714)
- Cleanup around scio context (#1679)
- Version bumps: cassandra-all -> 2.2.14 (#1677), 3.11.4 (#1678); Sparkey -> 3.0.0 (#1690), ES5 -> 5.6.15, ES6 -> 6.6.1 (#1700); tensorflow -> 1.13.1 (#1707); scalatest -> 3.0.6 (#1709); featran-* -> 0.3.0 (#1713)
Bug fixes
- Fix Magnolia generated tree annotations removal to ensure Derived coders are serializable (#1673)
v0.7.1
"Taxidea Taxus"
Features
- New HashCode-based partitioning method for keyed SCollections (#1654)
- New Coder for java.lang.ArrayList (#1649), and more space-efficient coders for small ADTs like Either and Try (#1652)
- New BinaryIO output (#1663)
- Simpler, clearer toString method for Coders (#1671)
- Custom Assertions for unit testing Coders added to scio-test package (#1642)
- New SideMap and SideSet SideInput types, usable in hashFullOuterJoin, hashIntersectByKey, and hashFilter methods
- Library version bumps: mysql-connector-java -> 8.0.15 (#1653), mysql-socket-factory -> 1.0.12 (#1627), protobuf-java -> 3.6.1 (#1633), hadoop-client -> 2.7.7 (#1634), jackson-module-scala -> 2.9.8 (#1632), parquet-avro -> 1.10.1 (#1648), kantan.csv -> 0.5.0 (#1647)
v0.7.0
"Suricata suricatta"
Breaking changes
- See v0.7.0 Migration Guide for detailed instructions
- New Magnolia based Coders derivation replaces ClassTag and Kryo
- New ScioIO replaces TestIO[T] to simplify IO implementation and stubbing in JobTest
- Update dynamic file destination API #1305
- Remove deprecated TensorFlow graph prediction method #1370
- Object file IO is no longer backwards compatible due to coder changes
- Refactor bigquery client (#1439)
Features
- Macro-based coder derivation for orders-of-magnitude faster (de)serialization (#1454)(#1394)(#1440)(#1434)(#1427)(#1412)(#1401)(#1429)(#1554)(#1494)(#1438)(#1605)(#1612)
- Redesigned unified ScioIO[T] for all IO modules
- Add SCollection#{readAll,readAllBytes} (splittable DoFn support) #796 #1363
- Sparse lookups (#1398)(#1354)(#1393)
- Add sparse left and right outer joins #1386
- Check and warn chained joins #1362
- Support Parquet compression #1189 #1318
- Port Parquet IO to Parquet 1.10 #1340 #1345
- Configurable fetch and batch size for JDBC IO #1314
- Add PubSubIO batch size write params (#1433)
- Improve coders messages
- Add BigQueryType typesafe args (#1476)(#1431)
- Support PubsubMessage in PubsubIO (#1395)
- Add subscription function to PubSubAdmin (#1483)
- Register sys.props (#1404)(#1406)
- Make typed and default args parsing logic more test friendly (#1421)
- Add Google Spanner package (#1491)
- Add BigQuery TimePartitioning support, fix #1419 (#1466)
- Add Numeric type support in scio-bigquery (#1599)
- Add scalafix rules (#1435)(#1464)(#1474)(#1468)(#1470)
- Expose transform function (#1492)(#1487)
- Allow creating DataflowResult from dfJob (#1481)
- Remove Future.failed in IOs (#1482)
- Add better error messages when missing sys.props (#1488)(#1461)
- Avoid second sql legacy check when using extractTables query op (#1508)
- Add support for more WriteDispositions in bigquery writeRows (#1511)
- Add call site transform name in union all (#1499)
- Update apache beam to 2.9.0 (#1580)
- Updated other dependencies (#1589)(#1586)(#1578)(#1579)(#1489)(#1544)(#1520)(#1539)(#1534)(#1533)(#1512)(#1517)(#1531)(#1521)(#1532)(#1538)(#1540)(#1526)(#1529)(#1518)(#1519)(#1536)(#1513)(#1530)(#1535)(#1527)(#1525)(#1524)(#1523)(#1514)(#1515)(#1516)(#1537)(#1509)(#1510)(#1565)(#1432)(#1614)
- Add elasticsearch 6 (#1572)
- Improve AvroType.toSchema annotation error if a case class is not provided (#1609)
- New scio website (#1610)
Bug fixes & Improvements
- Make PTransform names unique #1355 #1387
- Fail for unknown args in ContextAndArgs.typed[T] (#1413)
- Fix verifyNondeterministic exception in coders (#1418)
- Fix BigQueryType on refined types (#1424)
- Fix mergeAccumulators crash (#1428)
- Set timestamp attribute in JobTest for PubSubIO (#1417)
- Rework Coder's implicit not found message (again) (#1469)
- Fix KryoRegistrar scope widening (#1462)
- Make compression options in ExtractOps typed (#1449) (#1457)
- Add back BigQuery schema caching, regression of #1439 (#1458)
- Register default file systems in Scio test context (fix #1455) (#1463)
- Use coherent defaults across IO (#1478)
- Fix scio-repl to use refactored BigQuery client (#1459)
- Fix typed argument parsing when name contains camelCase (#1460)
- Pubsub topic name was not being set for Messages (#1568)
- Fix macro generated class directory (#1558)
- Fix stack overflow when maxByKey is used with explicit ordering (#1560)
- Fix id and timestamp attributes not being passed in saveAsPubsub (#1559)
- Fix flatten type inference changing the coder context bound to an implicit parameter (#1551)
- Fix: use CoderMaterializer in SideOutputCollections (#1548)
- Default to disabled warning on coders (#1588)
- Use alternative to deprecated write method (#1592)
- Simplify BigQueryType query method arg type parsing (#1585)
- Add rules for TextIO, AvroIO, PubsubIO and BigQueryIO (#1577)
- #1587: Fix sideoutput potentially missing coder (#1598)
- Add region to DataflowResult (#1479)
- Remove unused autovalue dependency (#1575)
v0.7.0-beta3
Bug fixes & Improvements
- Default to disabled warning on coders (#1588)
- Use alternative to deprecated write method (#1592)
- Simplify BigQueryType query method arg type parsing (#1585)
- Add rules for TextIO, AvroIO, PubsubIO and BigQueryIO (#1577)
- #1587: Fix sideoutput potentially missing coder (#1598)
- Update beam-runners-direct-java, ... to 2.9.0 (#1580)
- Update annoy4s to 0.8.0 (#1579)
- Update zoltar-api, zoltar-tensorflow to 0.5.1 (#1578)
- Update circe-core, circe-generic, ... to 0.11.0 (#1586)
- Update guava to 25.1-jre (#1589)
- Remove unused autovalue dependency (#1575)
v0.7.0-beta2
Bug Fixes
- Pubsub topic name was not being set for Messages (#1568)
- Fix macro generated class directory (#1558)
- Fix stack overflow when maxByKey is used with explicit ordering (#1560)
- Fix id and timestamp attributes not being passed in saveAsPubsub (#1559)
- Fix flatten type inference changing the coder context bound to an implicit parameter (#1551)
- Fix: use CoderMaterializer in SideOutputCollections (#1548)
Features
- Improve the implicitNotFound message on Coder (#1554)
- Add coursier for fast dep resolution (#1546)
- Avoid second sql legacy check when using extractTables query op (#1508)
- Add support for more WriteDispositions in bigquery writeRows (#1511)
- Add call site transform name in union all (#1499)
- Avoid using kryo coder for TF Schema and Feature (#1494)
- Updated dependencies (#1544) (#1520) (#1539) (#1534) (#1533) (#1512) (#1517) (#1531) (#1521) (#1532) (#1538) (#1540) (#1526) (#1529) (#1518) (#1519) (#1536) (#1513) (#1530) (#1535) (#1527) (#1525) (#1524) (#1523) (#1514) (#1515) (#1516) (#1537) (#1509) (#1510) (#1565)
v0.7.0-beta1
Bug fixes
- Rework Coder's implicit not found message (again) (#1469)
- Fix KryoRegistrar scope widening (#1462)
- Make compression options in ExtractOps typed (#1449) (#1457)
- Add back BigQuery schema caching, regression of #1439 (#1458)
- Register default file systems in Scio test context (fix #1455) (#1463)
- Use coherent defaults across IO (#1478)
- Fix scio-repl to use refactored BigQuery client (#1459)
- Fix typed argument parsing when name contains camelCase (#1460)
Features
- Add Google Spanner package (#1491)
- Add BigQuery TimePartitioning support, fix #1419 (#1466)
- Add subscription function to PubSubAdmin (#1483)
- Bump Beam to 2.8.0 (#1493)
- Update dependencies (#1489)
- Add scalafix rules (#1435)(#1464)(#1474)(#1468)(#1470)
- Add BigQueryType typesafe args (#1476)
- Add region to DataflowResult (#1479)
- Expose transform function (#1492)(#1487)
- Allow creating DataflowResult from dfJob (#1481)
- Remove Future.failed in IOs (#1482)
- Add better error messages when missing sys.props (#1488)(#1461)
v0.7.0-alpha2
Bug Fixes:
- Fail for unknown args in ContextAndArgs.typed[T] (#1413)
- Fix verifyNondeterministic exception in coders (#1418)
- Fix BigQueryType on refined types (#1424)
- Fix mergeAccumulators crash (#1428)
- Set timestamp attribute in JobTest for PubSubIO (#1417)
Features:
- Upgrade to beam 2.7 (#1430)
- Bump tensorflow to 1.11.0 (#1432)
- Add PubSubIO batch size write params (#1433)
- Improve coders messages (#1454)
- Add coders (#1394)(#1440)(#1434)(#1427)(#1412)(#1401)(#1429)
- Add better support for parameterised queries in @BigQueryType.fromQuery (#1431)
- Support PubsubMessage in PubsubIO (#1395)
- Register sys.props (#1404)(#1406)
- Make typed and default args parsing logic more test friendly (#1421)
- Sparse lookups (#1398)(#1354)(#1393)
v0.7.0-alpha1
Breaking changes
- See v0.7.0 Migration Guide for detailed instructions
- New Magnolia based Coders derivation replaces ClassTag and Kryo
- New ScioIO replaces TestIO[T] to simplify IO implementation and stubbing in JobTest
- Update dynamic file destination API #1305
- Remove deprecated TensorFlow graph prediction method #1370
- Object file IO is no longer backwards compatible due to coder changes
Features
- Magnolia- and macro-based coders for orders-of-magnitude faster (de)serialization
- Redesigned unified ScioIO[T] for all IO modules
- Add SCollection#{readAll,readAllBytes} (splittable DoFn support) #796 #1363
- Add sparse left and right outer joins #1386
- Check and warn chained joins #1362
- Support Parquet compression #1189 #1318
- Port Parquet IO to Parquet 1.10 #1340 #1345
- Configurable fetch and batch size for JDBC IO #1314