refactor: modularize tests #78

Merged
merged 6 commits into from
Nov 26, 2024

aykut-bozkurt
Collaborator

The single test file keeps getting bigger. Let's have separate test files for different scenarios.

@aykut-bozkurt aykut-bozkurt force-pushed the aykut/coerce-types-on-read branch 2 times, most recently from 78fc3c9 to ed3e766 Compare November 21, 2024 10:42
`COPY FROM parquet` is too strict when matching the Postgres tupledesc schema to the parquet file schema.
E.g. an `INT32` type in the parquet schema cannot be read into a Postgres column with `int64` type.
We can avoid this situation by casting the arrow array to the array type that is expected by the tupledesc
schema, if the cast is possible. We can make use of the `arrow-cast` crate, which lives in the same project
as `arrow`. Its public API lets us check whether a cast is possible between 2 arrow types and perform the cast.

To make sure the cast is possible, we need to do 2 checks:
1. arrow-cast allows the cast from "arrow type in the parquet file" to "arrow type in the schema that is
   generated for the tupledesc",
2. the cast is meaningful in Postgres. We check if there is an explicit cast from "Postgres type that corresponds
   to the arrow type in the parquet file" to "Postgres type in the tupledesc".

With that we can cast between many castable types as shown below:
- INT16 => INT32
- UINT32 => INT64
- FLOAT32 => FLOAT64
- LargeUtf8 => UTF8
- LargeBinary => Binary
- Struct, Array, and Map with castable fields, e.g. [UINT16] => [INT64] or struct {'x': UINT16} => struct {'x': INT64}

**NOTE**: Struct fields must match by name and position to be cast.
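
A rough sketch of how the castability rules above compose for nested types. Everything here is hypothetical: `ToyType` is a toy stand-in, not the `arrow` `DataType`, and the hard-coded allow list only mirrors the example pairs listed above. In the actual change, the decision comes from the arrow-cast check combined with the Postgres explicit-cast check, neither of which is modeled here.

```rust
/// Toy stand-in for arrow data types (hypothetical, NOT the real `arrow` `DataType`).
#[derive(Debug, Clone, PartialEq)]
enum ToyType {
    Int16,
    Int32,
    Int64,
    UInt16,
    UInt32,
    Float32,
    Float64,
    Utf8,
    LargeUtf8,
    Binary,
    LargeBinary,
    List(Box<ToyType>),
    Struct(Vec<(String, ToyType)>),
}

/// Returns true if `from` (file type) may be cast to `to` (tupledesc type).
/// The primitive pairs are an assumed allow list taken from the examples above.
fn is_castable(from: &ToyType, to: &ToyType) -> bool {
    use ToyType::*;
    if from == to {
        return true; // identical types need no cast
    }
    match (from, to) {
        // A list is castable when its element type is castable.
        (List(f), List(t)) => is_castable(f, t),
        // Struct fields must match by name and position, and each field must be castable.
        (Struct(ff), Struct(tf)) => {
            ff.len() == tf.len()
                && ff.iter().zip(tf.iter()).all(|((fname, ftype), (tname, ttype))| {
                    fname == tname && is_castable(ftype, ttype)
                })
        }
        (Int16, Int32)
        | (UInt16, Int64)
        | (UInt32, Int64)
        | (Float32, Float64)
        | (LargeUtf8, Utf8)
        | (LargeBinary, Binary) => true,
        _ => false,
    }
}
```

With this model, `[UINT16] => [INT64]` and `struct {'x': UINT16} => struct {'x': INT64}` are accepted, while a struct whose field is renamed from `x` to `y` is rejected.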

Closes #67.
@aykut-bozkurt aykut-bozkurt force-pushed the aykut/coerce-types-on-read branch from ed3e766 to b408fac Compare November 21, 2024 10:57
CopyOptionValue::StringOption("parquet".to_string()),
);

let uri = "/tmp/test.parquet".to_string();
Collaborator


Maybe we should use more obscure file names. A user/developer might have an unrelated file with this name and get very confused.
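
One possible way to do this, as a sketch only: build the path from the system temp directory, the process id, and a per-process counter. The helper name `unique_test_parquet_path` and the `copy_test_` prefix are made up for illustration and are not part of this PR.

```rust
use std::env;
use std::path::PathBuf;
use std::process;
use std::sync::atomic::{AtomicU64, Ordering};

// Per-process counter so repeated calls never return the same path.
static NEXT_ID: AtomicU64 = AtomicU64::new(0);

/// Hypothetical helper: returns a file path under the system temp directory
/// that is unlikely to collide with an unrelated file a user already has.
fn unique_test_parquet_path(tag: &str) -> PathBuf {
    let seq = NEXT_ID.fetch_add(1, Ordering::Relaxed);
    let file_name = format!("copy_test_{}_{}_{}.parquet", tag, process::id(), seq);
    env::temp_dir().join(file_name)
}
```

A test could then call `unique_test_parquet_path("roundtrip")` instead of hard-coding `/tmp/test.parquet`.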

}
}

pub(crate) fn test_helper<T: IntoDatum + FromDatum + Debug + PartialEq>(test_table: TestTable<T>) {
Collaborator


It could be nice to have a more descriptive name here now that things are getting more dispersed, e.g. `test_export_import_equality` or `test_round_trip_equality`.

Base automatically changed from aykut/coerce-types-on-read to main November 26, 2024 12:36
@aykut-bozkurt aykut-bozkurt enabled auto-merge (squash) November 26, 2024 13:15
codecov bot commented Nov 26, 2024

Codecov Report

Attention: Patch coverage is 93.96662% with 235 lines in your changes missing coverage. Please review.

Project coverage is 92.01%. Comparing base (e3b4476) to head (05a7052).
Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/pgrx_tests/copy_from_coerce.rs | 87.86% | 118 Missing ⚠️ |
| src/pgrx_tests/copy_type_roundtrip.rs | 90.26% | 95 Missing ⚠️ |
| src/pgrx_tests/common.rs | 90.22% | 22 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #78      +/-   ##
==========================================
+ Coverage   91.94%   92.01%   +0.06%     
==========================================
  Files          62       70       +8     
  Lines        8795     8879      +84     
==========================================
+ Hits         8087     8170      +83     
- Misses        708      709       +1     


@aykut-bozkurt aykut-bozkurt merged commit fbaeadb into main Nov 26, 2024
6 checks passed
@aykut-bozkurt aykut-bozkurt deleted the aykut/modularize-tests branch November 26, 2024 13:22