Loader should not call the "list" Iglu endpoint for Snowflake/Databricks tables

When loading to Redshift, the loader needs to call the "list" Iglu
API endpoint in order to discover all minor versions of a schema. It
uses that list to merge the schemas.

But for Snowflake/Databricks tables the loader does not need to merge
schemas, so there is no need to call the "list" endpoint.

In RDB Loader version 6.0.0 we accidentally changed the loader so it
calls the "list" endpoint regardless of the table type. This is a
problem for Snowflake/Databricks loads because:

- it is inefficient to call an API endpoint when we don't need the result
- some old Iglu repos do not support the list endpoint.
istreeter authored and oguzhanunlu committed Jul 5, 2024
1 parent 4389808 commit 25d6026
Showing 1 changed file with 4 additions and 1 deletion.
@@ -181,7 +181,10 @@ object DataDiscovery {
             .leftMap(er => LoaderError.DiscoveryError(IgluError(s"Error inferring columns names $er")))
         }
       }
-      models <- getShredModels[F](nonAtomicTypes)
+      models <- message.typesInfo match {
+        case TypesInfo.Shredded(_) => getShredModels[F](nonAtomicTypes)
+        case TypesInfo.WideRow(_, _) => EitherT.rightT[F, LoaderError](Map.empty[SchemaKey, DiscoveredShredModels])
+      }
     } yield DataDiscovery(message.base, types.distinct, message.compression, message.typesInfo, wideColumns, models)
   }
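
The effect of the change can be illustrated with a minimal, self-contained sketch. Everything below is a simplified stand-in and not the real RDB Loader code: `ShredModel`, `listVersions`, `discoverModels` and the `listCalls` counter are hypothetical names, schema keys are plain strings, and the standard library's `Either` replaces `EitherT[F, LoaderError, *]`. The sketch only shows that the stubbed "list" lookup is reached for shredded (Redshift) types and skipped for wide-row (Snowflake/Databricks) types.

```scala
// Hypothetical, simplified stand-ins for the loader's types (assumption: not the real API).
sealed trait TypesInfo
object TypesInfo {
  // Shredded output is what Redshift loads use.
  final case class Shredded(types: List[String]) extends TypesInfo
  // Wide-row output is what Snowflake/Databricks loads use.
  final case class WideRow(fileFormat: String, types: List[String]) extends TypesInfo
}

// Hypothetical result of merging all known versions of one schema.
final case class ShredModel(schemaKey: String, mergedVersions: List[String])

object DiscoverySketch {
  // Counts how often the stubbed "list" endpoint is hit, for illustration only.
  var listCalls: Int = 0

  // Stub standing in for a call to the Iglu "list" endpoint.
  def listVersions(schemaKey: String): Either[String, List[String]] = {
    listCalls += 1
    Right(List(s"$schemaKey/1-0-0", s"$schemaKey/1-0-1"))
  }

  // Builds one model per schema key; each key costs one "list" call.
  def getShredModels(keys: List[String]): Either[String, Map[String, ShredModel]] =
    keys.foldLeft[Either[String, Map[String, ShredModel]]](Right(Map.empty)) { (acc, key) =>
      for {
        models   <- acc
        versions <- listVersions(key)
      } yield models + (key -> ShredModel(key, versions))
    }

  // Mirrors the shape of the fix: only shredded types need merged models.
  def discoverModels(typesInfo: TypesInfo, keys: List[String]): Either[String, Map[String, ShredModel]] =
    typesInfo match {
      case TypesInfo.Shredded(_)   => getShredModels(keys)
      case TypesInfo.WideRow(_, _) => Right(Map.empty) // no "list" call at all
    }
}
```

With this sketch, calling `DiscoverySketch.discoverModels` with a `TypesInfo.WideRow` value returns an empty map and leaves `listCalls` untouched, so a repository without a list endpoint would never be queried on that path.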
