Releases · spotify/scio

12 Mar 20:09

regadas

v0.7.3

40648bf

v0.7.3

"Vulpes Vulpes"

Bug Fixes & Improvements

Fix FileStorage.avroFile (#1727)
Fix perf regression in Coder (#1729)
Reduce the size of the captured stacktrace in WrappedBCoder (#1745)
Fix #1734: Limit job graph size by not wrapping native beam coders (#1741)
Explicit reset position on SeekableInput (#1747)
Support scalatest NotWord (#1743)
Make BigQuery priority sysprop case-insensitive (#1736)
Use getSchema and avoid reflection when creating AvroCoder (#1724)
Clarify error message when a job uses an input multiple times (#1720)
Fix tiny typos in Coders.md (#1732)
Fix incorrect generic type in ScalaDoc (#1725)
Use BenchmarkResult as entity (#1712)
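
The case-insensitive sysprop change (#1736) can be illustrated with a small sketch. This is not Scio's implementation: the function name `parse_priority` and the fallback default are assumptions made for illustration, and Python is used only to keep the sketch self-contained. `BATCH` and `INTERACTIVE` are the two BigQuery job priorities.

```python
from enum import Enum


class Priority(Enum):
    BATCH = "BATCH"
    INTERACTIVE = "INTERACTIVE"


def parse_priority(raw, default=Priority.BATCH):
    """Map a raw property string to a Priority, ignoring case and whitespace.

    The default used for a missing value is a hypothetical choice.
    """
    if raw is None or not raw.strip():
        return default
    try:
        # Normalise before the lookup so "batch", "Batch" and "BATCH" all match.
        return Priority[raw.strip().upper()]
    except KeyError:
        raise ValueError(f"Unknown priority: {raw!r}") from None
```

Normalising once at the parsing boundary keeps the rest of the code working with a closed set of values instead of raw strings.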

04 Mar 17:57

clairemcginty

v0.7.2

2efa0d7

v0.7.2

"Ursus t. Ussuricus"

Features

Update Beam to 2.10 (#1674, #1676)
Clearer Coder exceptions (#1672)
Use new HadoopFormatIO (#1675)
Add spanner MutationGroup coder (#1704)
Optimize CombineFns (speeds up aggregate-, reduce-, and combine-based operations) (#1699)
Use list side input on cross product (#1691)
Fix DistinctBy serialization for Scala Classes (#1710, #1715)
Remove deprecation warning on tfRecordExampleFileWithSchema (#1714)
Cleanup around scio context (#1679)
Version bumps: cassandra-all -> 2.2.14 (#1677), 3.11.4 (#1678); Sparkey -> 3.0.0 (#1690); ES5 -> 5.6.15, ES6 -> 6.6.1 (#1700); tensorflow -> 1.13.1 (#1707); scalatest -> 3.0.6 (#1709); featran-* -> 0.3.0 (#1713)

Bug fixes

Fix Magnolia generated tree annotations removal to ensure Derived coders are serializable (#1673)

08 Feb 18:45

clairemcginty

v0.7.1

7f5ed7e

v0.7.1

"Taxidea Taxus"

Features

New HashCode-based partitioning method for keyed SCollections (#1654)
New Coder for java.lang.ArrayList (#1649), and more space-efficient coders for small ADTs like Either and Try (#1652)
New BinaryIO output (#1663)
Simpler, clearer toString method for Coders (#1671)
Custom Assertions for unit testing Coders added to scio-test package (#1642)
New SideMap and SideSet SideInput types, usable in hashFullOuterJoin, hashIntersectByKey, and hashFilter methods
Library version bumps: mysql-connector-java -> 8.0.15 (#1653), mysql-socket-factory -> 1.0.12 (#1627), protobuf-java -> 3.6.1 (#1633), hadoop-client -> 2.7.7 (#1634), jackson-module-scala -> 2.9.8 (#1632), parquet-avro -> 1.10.1 (#1648), kantan.csv -> 0.5.0 (#1647)
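
The idea behind hash-code-based partitioning of keyed collections (#1654) can be sketched as follows. This is an illustration of the general technique, not Scio's API: the function names are hypothetical, and CRC32 stands in for the JVM `hashCode` so that the result is stable across Python processes.

```python
import zlib


def partition_index(key: str, num_partitions: int) -> int:
    """Pick a partition from a stable hash of the key (CRC32 is a stand-in)."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def hash_partition_by_key(pairs, num_partitions):
    """Split (key, value) pairs so that equal keys always share a partition."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in pairs:
        partitions[partition_index(key, num_partitions)].append((key, value))
    return partitions
```

Because the partition depends only on the key, all values for a given key end up together, which is the property keyed operations rely on.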

Bug fixes & Improvements

Optimized Bloom filter aggregations in sparse joins (#1644)
Spanner-specific Coders repackaged from scio-core to scio-spanner (#1630)
Fallback coder always uses Kryo (#1668) and RichCoderRegistry is removed (#1670)
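
For readers unfamiliar with why a Bloom filter helps in sparse joins (#1644), here is a minimal single-process sketch. It is not Scio's implementation: `BloomFilter`, `sparse_join`, the bit count, and the hash scheme are all assumptions chosen for illustration, and a real distributed join would not hold the small side in a dictionary.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive several bit positions by salting one hash with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def sparse_join(large, small):
    """Inner-join two lists of (key, value) pairs, prefiltering the large side."""
    bloom = BloomFilter()
    small_by_key = {}
    for key, value in small:
        bloom.add(key)
        small_by_key.setdefault(key, []).append(value)
    # Cheaply discard most keys of the large side that cannot match.
    candidates = ((k, v) for k, v in large if bloom.might_contain(k))
    # The exact lookup removes any false positives the filter let through.
    return [(k, (lv, sv)) for k, lv in candidates
            for sv in small_by_key.get(k, [])]
```

The filter only has to be small and fast; correctness comes from the exact lookup that follows it.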

18 Jan 19:48

regadas

v0.7.0

1cb4e70

v0.7.0

"Suricata suricatta"

Breaking changes

See v0.7.0 Migration Guide for detailed instructions
New Magnolia-based Coders derivation replaces ClassTag and Kryo
New ScioIO replaces TestIO[T] to simplify IO implementation and stubbing in JobTest
Update dynamic file destination API #1305
Remove deprecated TensorFlow graph prediction method #1370
Object file IO is no longer backwards compatible due to coder changes
Refactor bigquery client (#1439)

Features

Macro-based coder derivation for orders-of-magnitude faster (de)serialization (#1454)(#1394)(#1440)(#1434)(#1427)(#1412)(#1401)(#1429)(#1554)(#1494)(#1438)(#1605)(#1612)
Redesigned unified ScioIO[T] for all IO modules
Add SCollection#{readAll,readAllBytes} (splittable DoFn support) #796 #1363
Sparse lookups (#1398)(#1354)(#1393)
Add sparse left and right outer joins #1386
Check for and warn about chained joins #1362
Support Parquet compression #1189 #1318
Port Parquet IO to Parquet 1.10 #1340 #1345
Configurable fetch and batch size for JDBC IO #1314
Add PubSubIO batch size write params (#1433)
Improve coder messages
Add BigQueryType typesafe args (#1476)(#1431)
Support PubsubMessage in PubsubIO (#1395)
Add subscription function to PubSubAdmin (#1483)
Register sys.props (#1404)(#1406)
Make typed and default args parsing logic more test friendly (#1421)
Add Google Spanner package (#1491)
Add BigQuery TimePartitioning support, fix #1419 (#1466)
Add Numeric type support in scio-bigquery (#1599)
Add scalafix rules (#1435)(#1464)(#1474)(#1468)(#1470)
Expose transform function (#1492)(#1487)
Allow creating DataflowResult from df Job (#1481)
Remove Future.failed in IOs (#1482)
Add better error messages when missing sys.props (#1488)(#1461)
Avoid second sql legacy check when using extractTables query op (#1508)
Add support for more WriteDispositions in bigquery writeRows (#1511)
Add call site transform name in union all (#1499)
Update Apache Beam to 2.9.0 (#1580)
Updated other dependencies (#1589)(#1586)(#1578)(#1579)(#1489)(#1544)(#1520)(#1539)(#1534)(#1533)(#1512)(#1517)(#1531)(#1521)(#1532)(#1538)(#1540)(#1526)(#1529)(#1518)(#1519)(#1536)(#1513)(#1530)(#1535)(#1527)(#1525)(#1524)(#1523)(#1514)(#1515)(#1516)(#1537)(#1509)(#1510)(#1565)(#1432)(#1614)
Add elasticsearch 6 (#1572)
Improve AvroType.toSchema annotation error if a case class is not provided (#1609)
New scio website (#1610)

Bug fixes & Improvements

Make PTransform names unique #1355 #1387
Fail for unknown args in ContextAndArgs.typed[T] (#1413)
Fix verifyNondeterministic exception in coders (#1418)
Fix BigQueryType on refined types (#1424)
Fix mergeAccumulators crash (#1428)
Set timestamp attribute in JobTest for PubSubIO (#1417)
Rework Coder's implicit not found message (again) (#1469)
Fix KryoRegistrar scope widening (#1462)
Make compression options in ExtractOps typed (#1449) (#1457)
Add back BigQuery schema caching, regression of #1439 (#1458)
Register default file systems in Scio test context (fix #1455) (#1463)
Use coherent defaults across IO (#1478)
Fix scio-repl to use refactored BigQuery client (#1459)
Fix typed argument parsing when the name contains camelCase (#1460)
Fix Pubsub topic name not being set for Messages (#1568)
Fix macro generated class directory (#1558)
Fix stack overflow when maxByKey is used with explicit ordering (#1560)
Fix id and timestamp attributes not being passed in saveAsPubsub (#1559)
Fix flatten type inference changing the coder context bound to an implicit parameter (#1551)
Fix: use CoderMaterializer in SideOutputCollections (#1548)
Default to disabled warning on coders (#1588)
Use alternative to deprecated write method (#1592)
Simplify BigQueryType query method arg type parsing (#1585)
Add rules for TextIO, AvroIO, PubsubIO and BigQueryIO (#1577)
#1587: Fix sideoutput potentially missing coder (#1598)
Add region to DataflowResult (#1479)
Remove unused autovalue dependency (#1575)
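
Two of the argument-parsing fixes above (failing on unknown args, #1413, and names containing camelCase, #1460) describe behaviour that is easy to show in a sketch. The following is not `ContextAndArgs.typed[T]`; `parse_typed_args` and the `--name=value` shape it accepts are assumptions made for illustration.

```python
def parse_typed_args(argv, known):
    """Parse ``--name=value`` arguments against a table of known names.

    ``known`` maps each accepted name to a converter such as ``int`` or ``str``.
    Names are compared exactly, so camelCase names like ``maxRetries`` survive.
    """
    parsed = {}
    for arg in argv:
        if not arg.startswith("--") or "=" not in arg:
            raise ValueError(f"Malformed argument: {arg!r}")
        name, value = arg[2:].split("=", 1)
        if name not in known:
            # Fail fast instead of silently ignoring a possible typo.
            raise ValueError(f"Unknown argument: --{name}")
        parsed[name] = known[name](value)
    return parsed
```

Rejecting unknown names turns a misspelled flag into an immediate error rather than a job that silently runs with defaults.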

08 Jan 09:22

regadas

v0.7.0-beta3

ac600fd

v0.7.0-beta3 Pre-release

Bug fixes & Improvements

Default to disabled warning on coders (#1588)
Use alternative to deprecated write method (#1592)
Simplify BigQueryType query method arg type parsing (#1585)
Add rules for TextIO, AvroIO, PubsubIO and BigQueryIO (#1577)
#1587: Fix sideoutput potentially missing coder (#1598)
Update beam-runners-direct-java, ... to 2.9.0 (#1580)
Update annoy4s to 0.8.0 (#1579)
Update zoltar-api, zoltar-tensorflow to 0.5.1 (#1578)
Update circe-core, circe-generic, ... to 0.11.0 (#1586)
Update guava to 25.1-jre (#1589)
Remove unused autovalue dependency (#1575)

Features

Add elasticsearch 6 (#1572)
Add Numeric type support in scio-bigquery (#1599)

06 Dec 19:03

andrisnoko

v0.7.0-beta2

740d936

v0.7.0-beta2 Pre-release

Bug Fixes

Fix Pubsub topic name not being set for Messages (#1568)
Fix macro generated class directory (#1558)
Fix stack overflow when maxByKey is used with explicit ordering (#1560)
Fix id and timestamp attributes not being passed in saveAsPubsub (#1559)
Fix flatten type inference changing the coder context bound to an implicit parameter (#1551)
Fix: use CoderMaterializer in SideOutputCollections (#1548)

Features

Improve the implicitNotFound message on Coder (#1554)
Add coursier for fast dep resolution (#1546)
Avoid second sql legacy check when using extractTables query op (#1508)
Add support for more WriteDispositions in bigquery writeRows (#1511)
Add call site transform name in union all (#1499)
Avoid using kryo coder for TF Schema and Feature (#1494)
Updated dependencies (#1544) (#1520) (#1539) (#1534) (#1533) (#1512) (#1517) (#1531) (#1521) (#1532) (#1538) (#1540) (#1526) (#1529) (#1518) (#1519) (#1536) (#1513) (#1530) (#1535) (#1527) (#1525) (#1524) (#1523) (#1514) (#1515) (#1516) (#1537) (#1509) (#1510) (#1565)

06 Dec 20:52

regadas

v0.7.0-beta1

5bd8d4e

v0.7.0-beta1 Pre-release

Bug fixes

Rework Coder's implicit not found message (again) (#1469)
Fix KryoRegistrar scope widening (#1462)
Make compression options in ExtractOps typed (#1449) (#1457)
Add back BigQuery schema caching, regression of #1439 (#1458)
Register default file systems in Scio test context (fix #1455) (#1463)
Use coherent defaults across IO (#1478)
Fix scio-repl to use refactored BigQuery client (#1459)
Fix typed argument parsing when the name contains camelCase (#1460)

Features

Add Google Spanner package (#1491)
Add BigQuery TimePartitioning support, fix #1419 (#1466)
Add subscription function to PubSubAdmin (#1483)
Bump Beam to 2.8.0 (#1493)
Update dependencies (#1489)
Add scalafix rules (#1435)(#1464)(#1474)(#1468)(#1470)
Add BigQueryType typesafe args (#1476)
Add region to DataflowResult (#1479)
Expose transform function (#1492)(#1487)
Allow creating DataflowResult from df Job (#1481)
Remove Future.failed in IOs (#1482)
Add better error messages when missing sys.props (#1488)(#1461)

17 Oct 16:55

regadas

v0.7.0-alpha2

f9afe13

v0.7.0-alpha2 Pre-release

Bug Fixes:

Fail for unknown args in ContextAndArgs.typed[T] (#1413)
Fix verifyNondeterministic exception in coders (#1418)
Fix BigQueryType on refined types (#1424)
Fix mergeAccumulators crash (#1428)
Set timestamp attribute in JobTest for PubSubIO (#1417)

Features:

Upgrade to beam 2.7 (#1430)
Bump tensorflow to 1.11.0 (#1432)
Add PubSubIO batch size write params (#1433)
Improve coder messages (#1454)
Add coders (#1394)(#1440)(#1434)(#1427)(#1412)(#1401)(#1429)
Add better support for parameterised queries in @BigQueryType.fromQuery (#1431)
Support PubsubMessage in PubsubIO (#1395)
Register sys.props (#1404)(#1406)
Make typed and default args parsing logic more test friendly (#1421)
Sparse lookups (#1398)(#1354)(#1393)

Breaking Changes:

Refactor bigquery client (#1439)
Move all coders to scio-core and rename scio-coders-macros to scio-macros (#1438)

20 Sep 06:11

nevillelyh

v0.7.0-alpha1

45dbb3c

v0.7.0-alpha1 Pre-release

Breaking changes

See v0.7.0 Migration Guide for detailed instructions
New Magnolia-based Coders derivation replaces ClassTag and Kryo
New ScioIO replaces TestIO[T] to simplify IO implementation and stubbing in JobTest
Update dynamic file destination API #1305
Remove deprecated TensorFlow graph prediction method #1370
Object file IO is no longer backwards compatible due to coder changes

Features

Magnolia- and macro-based coder derivation for orders-of-magnitude faster (de)serialization
Redesigned unified ScioIO[T] for all IO modules
Add SCollection#{readAll,readAllBytes} (splittable DoFn support) #796 #1363
Add sparse left and right outer joins #1386
Check for and warn about chained joins #1362
Support Parquet compression #1189 #1318
Port Parquet IO to Parquet 1.10 #1340 #1345
Configurable fetch and batch size for JDBC IO #1314

Bug fixes

Make PTransform names unique #1355 #1387

12 Sep 15:27

nevillelyh

v0.6.1

a3134cd

v0.6.1

"Rhyncholestes raphanurus"

Features

Expose ScioResult#isTest #1336
Meaningful NPE message for nulls in BigQueryType macro #1303 #1332
Add streaming benchmark #1294 #1333 #1338 #1353
Bump ASM version #1358

Bug fixes

Make PCollection names unique #1356
Register joda-time serializers #1341 #1347
Fix duplicate jars in classpath #1334 #1348
Use location-aware Dataflow job endpoints #1337
Change DoFnWithResource logging level to DEBUG #1351
Cache schema in AvroType macro #1025 #1359
