SNOW-1737840: Adapt record mapping in RecordService #969

Merged

Conversation

sfc-gh-wtrefon (Contributor, PR author)

Overview

SNOW-1737840

Pre-review checklist

  • This change should be part of a Behavior Change Release. See go/behavior-change.
  • This change has passed Merge gate tests
  • Snowpipe Changes
  • Snowpipe Streaming Changes
  • This change is TEST-ONLY
  • This change is README/Javadocs only
  • This change is protected by a config parameter <PARAMETER_NAME>, e.g. snowflake.ingestion.method.
    • Yes - Added end-to-end and unit tests.
    • No - Suggest why it is not param protected
  • Is this change protected by parameter <PARAMETER_NAME> on the server side?
    • The parameter/feature is not yet active in production (partial rollout or PrPr, see Changes for Unreleased Features and Fixes).
    • If there is an issue, it can be safely mitigated by turning the parameter off. This is also verified by a test (See go/ppp).

Comment on lines +37 to +39
if (includeAllMetadata) {
  streamingIngestRow.put(TABLE_COLUMN_METADATA, mapper.writeValueAsString(row.getMetadata()));
}

Contributor Author (sfc-gh-wtrefon):

Not sure if there was some trick here, but I moved it from the forEach loop above to here. I don't see any point in serializing and putting the metadata all over again for every JsonNode. This should improve overall processing speed, since we no longer serialize it for every JSON node.
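
To illustrate the pattern (a simplified sketch with hypothetical method and field names, not the exact connector code; it assumes Jackson's ObjectMapper as in the snippet above):

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MetadataSerializationSketch {

  private static final String TABLE_COLUMN_METADATA = "RECORD_METADATA";
  private final ObjectMapper mapper = new ObjectMapper();

  Map<String, Object> mapRow(List<JsonNode> jsonNodes, Object metadata, boolean includeAllMetadata)
      throws JsonProcessingException {
    Map<String, Object> streamingIngestRow = new HashMap<>();

    // Serialize the metadata exactly once, outside the per-node loop ...
    if (includeAllMetadata) {
      streamingIngestRow.put(TABLE_COLUMN_METADATA, mapper.writeValueAsString(metadata));
    }

    // ... instead of calling mapper.writeValueAsString(metadata) inside this loop for every node.
    for (JsonNode node : jsonNodes) {
      node.fields().forEachRemaining(entry -> streamingIngestRow.put(entry.getKey(), entry.getValue()));
    }
    return streamingIngestRow;
  }
}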

Contributor:

I think I even saw a tech-debt ticket about this repeated serialization. It's great that you've optimized it.

import org.junit.jupiter.params.provider.Arguments;
import org.junit.jupiter.params.provider.MethodSource;

class IcebergTableStreamingRecordMapperTest {

Contributor Author (sfc-gh-wtrefon):

Please think of more test cases if possible, ideally with an example :)
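
For illustration, an additional parameterized case might look roughly like this (a self-contained sketch exercising the textual-vs-other header mapping rather than the real mapper class, whose API may differ):

import static org.junit.jupiter.api.Assertions.assertEquals;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.stream.Stream;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.Arguments;
import org.junit.jupiter.params.provider.MethodSource;

class HeaderValueMappingSketchTest {

  private static final ObjectMapper MAPPER = new ObjectMapper();

  // Each argument pair: raw JSON header value -> expected mapped string
  static Stream<Arguments> headerValues() {
    return Stream.of(
        Arguments.of("\"text\"", "text"),        // textual node is unwrapped
        Arguments.of("42", "42"),                // numeric node is serialized as-is
        Arguments.of("{\"a\":1}", "{\"a\":1}")); // nested object is kept as compact JSON
  }

  @ParameterizedTest
  @MethodSource("headerValues")
  void shouldMapHeaderValueToString(String rawJson, String expected) throws Exception {
    JsonNode node = MAPPER.readTree(rawJson);
    String actual = node.isTextual() ? node.asText() : MAPPER.writeValueAsString(node);
    assertEquals(expected, actual);
  }
}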

Contributor:

Looks exhaustive to me, let's wait for the bugs to happen :)


@Override
public Map<String, Object> processSnowflakeRecord(
    SnowflakeTableRow row, boolean schematizationEnabled, boolean includeAllMetadata)

Contributor:

nit: maybe just boolean includeMetadata?

String key = fields.next();
JsonNode valueNode = headersNode.get(key);
String value;
if (valueNode.isTextual()) {

Contributor:

nit: extract String getTextualValue(JsonNode node) and reuse in SnowflakeTableStreamingRecordMapper?
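
A minimal sketch of the suggested helper (assuming Jackson's JsonNode; the non-textual branch falling back to JSON serialization is an assumption about the existing behaviour):

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

final class JsonNodeUtilsSketch {

  private static final ObjectMapper MAPPER = new ObjectMapper();

  // Unwrap textual nodes; serialize everything else back to its JSON representation.
  static String getTextualValue(JsonNode node) throws JsonProcessingException {
    return node.isTextual() ? node.asText() : MAPPER.writeValueAsString(node);
  }
}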

this.clock = clock;
this.enableSchematization = enableSchematization;

Contributor:

I wonder if RecordService needs to be aware of the enableSchematization value after creating the StreamingRecordMapper abstraction. In fact, we could pass this flag to the mapper in a factory instead of as a processSnowflakeRecord argument.
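
Something along these lines, perhaps (a hypothetical sketch, not the PR's actual classes): the flag is fixed at construction time by a factory, so callers never pass it per record:

import java.util.HashMap;
import java.util.Map;

interface StreamingRecordMapperSketch {
  Map<String, Object> processSnowflakeRecord(Map<String, Object> row, boolean includeMetadata);
}

final class StreamingRecordMapperFactorySketch {

  static StreamingRecordMapperSketch create(boolean enableSchematization) {
    // enableSchematization is captured once here instead of being passed on every call
    return (row, includeMetadata) -> {
      Map<String, Object> out =
          enableSchematization
              ? new HashMap<>(row)                                        // one column per field
              : new HashMap<>(Map.of("RECORD_CONTENT", row.toString())); // single content column
      if (includeMetadata) {
        out.put("RECORD_METADATA", "<serialized metadata>"); // placeholder, illustrative only
      }
      return out;
    };
  }
}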

Contributor Author (sfc-gh-wtrefon):

good idea

@@ -67,8 +123,7 @@ public static List<RecordWithMetadata<PrimitiveJsonRecord>> fromSchematizedResul
resultSet.getLong("ID_INT8"),
resultSet.getLong("ID_INT16"),
resultSet.getLong("ID_INT32"),
// FIXME: there is currently some bug in Iceberg when storing int64 values
// resultSet.getLong("ID_INT64"),
resultSet.getLong("ID_INT64"),

sfc-gh-mbobowski (Contributor) left a comment:
Left some minor comments but great change overall!

sfc-gh-wtrefon marked this pull request as ready for review on October 24, 2024 at 13:43
sfc-gh-wtrefon requested a review from a team as a code owner on October 24, 2024 at 13:43
sfc-gh-wtrefon merged commit 8a7a305 into master on Oct 25, 2024
53 of 54 checks passed
sfc-gh-wtrefon deleted the wtrefon/SNOW-1737840-record-service-iceberg-mapping branch on October 25, 2024 at 06:56