Tidy/ingestor #1714

Merged: 19 commits merged on Aug 21, 2024

@shivam-880 (Collaborator) commented Aug 15, 2024

What changes were proposed in this pull request?

For the Pandas/Parquet loaders, this PR does the following:

  • Remove load_from_pandas and load_from_parquet. These took far too many arguments and were very confusing.
  • Rename load_edges_deletions_from_parquet to load_edge_deletions_from_parquet and load_edges_deletions_from_pandas to load_edge_deletions_from_pandas, so that the names are grammatically correct.
  • Bring the order of required arguments in line with add_node/add_edge: instead of src, dst, time the order is now time, src, dst. (#1673: Bring pandas/parquet loaders api args inline with add_nodes/add_edges)
  • Change every reference to const_properties to constant_properties.
  • Add a node_type column to the node property loaders. (#1674: Add node_type col to load_node_props_from_pandas/parquet)
  • Replace layer/layer_in_df with layer and layer_col: Option<&str>, and raise a runtime error if both are specified at the same time.
  • Replace node_type/node_type_in_df with node_type and node_type_col: Option<&str>, and raise a runtime error if both are specified at the same time.
  • Update the docs once the above is done.
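
The "either a constant or a column, never both" rule for layer/layer_col (and, by the same pattern, node_type/node_type_col) can be sketched in plain Python. This is only an illustration of the rule as described in the list above: resolve_layer is a hypothetical helper, not part of Raphtory's API, and the real check is presumably implemented in the Rust loader code.

```python
from typing import Optional, Tuple


def resolve_layer(
    layer: Optional[str] = None,
    layer_col: Optional[str] = None,
) -> Tuple[str, Optional[str]]:
    """Hypothetical helper: decide where each edge's layer comes from.

    Returns ("constant", name) when one layer applies to every row,
    ("column", column_name) when the layer is read per row from a column,
    or ("default", None) when neither argument is given.
    """
    if layer is not None and layer_col is not None:
        # Both given at once is ambiguous, so fail loudly instead of guessing.
        raise RuntimeError("specify either 'layer' or 'layer_col', not both")
    if layer is not None:
        return ("constant", layer)
    if layer_col is not None:
        return ("column", layer_col)
    return ("default", None)
```

Passing both arguments raises a RuntimeError, while passing one or neither returns a tag saying how the layer should be resolved. Splitting the old layer/layer_in_df pair into two optional arguments means the meaning of each argument no longer depends on a separate boolean flag.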

Why are the changes needed?

Improving the user experience when using the most typical ways of loading data into Raphtory

Does this PR introduce any user-facing change? If yes is this documented?

Yes and yes

How was this patch tested?

Via the current suite of loader tests

Issues

Fixes #1675
Fixes #1674
Fixes #1673

Are there any further changes required?

The loaders need to be parallelised, now that they are chunked.
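
As a rough illustration of why chunking makes parallelisation possible, the sketch below splits rows into chunks, processes each chunk in a worker, and concatenates the results in the original order. It uses only the Python standard library. The names chunked, parse_chunk, and load_parallel are hypothetical and say nothing about how Raphtory's loaders are or will be implemented.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, List, Sequence, Tuple

# (time, src, dst), following the argument order adopted in this PR.
Edge = Tuple[int, str, str]


def chunked(rows: Sequence[Edge], size: int) -> Iterator[Sequence[Edge]]:
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def parse_chunk(chunk: Sequence[Edge]) -> List[Edge]:
    """Stand-in for per-chunk work such as validation or type conversion."""
    return [(int(t), str(s), str(d)) for t, s, d in chunk]


def load_parallel(rows: Sequence[Edge], chunk_size: int = 2, workers: int = 4) -> List[Edge]:
    """Process chunks concurrently and return all rows in their original order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order, so chunk order is preserved.
        parsed = pool.map(parse_chunk, chunked(rows, chunk_size))
        return [edge for chunk in parsed for edge in chunk]
```

Because each chunk is handled independently, the per-chunk step is the natural unit of parallel work; only the final merge needs to respect ordering.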

@shivam-880 shivam-880 marked this pull request as ready for review August 15, 2024 15:57
# Conflicts:
#	python/tests/test_disk_graph.py
#	raphtory/src/core/utils/errors.rs
#	raphtory/src/io/arrow/dataframe.rs
#	raphtory/src/io/arrow/df_loaders.rs
#	raphtory/src/io/arrow/mod.rs
#	raphtory/src/io/arrow/prop_handler.rs
#	raphtory/src/io/parquet_loaders.rs
#	raphtory/src/python/graph/disk_graph.rs
#	raphtory/src/python/graph/graph.rs
#	raphtory/src/python/graph/graph_with_deletions.rs
#	raphtory/src/python/graph/io/pandas_loaders.rs
# Conflicts:
#	python/python/raphtory/__init__.pyi
#	python/python/raphtory/graphql/__init__.pyi
#	raphtory/src/python/graph/graph_with_deletions.rs
@miratepuffin miratepuffin merged commit caa8927 into master Aug 21, 2024
19 checks passed
@miratepuffin miratepuffin deleted the tidy/ingestor branch August 21, 2024 16:14