forked from alibaba/GraphScope
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Committed-by: [email protected] from Dev container
1 parent 5a5baeb
commit f9a1522
Showing 9 changed files with 269 additions and 0 deletions.
@@ -0,0 +1,22 @@
name: wiki # then must have a modern dir under ${data} directory
store_type: mutable_csr # v6d, groot, gart
schema:
  vertex_types:
    - type_name: user
      type_id: 0
      x_csr_params:
        max_vertex_num: 5000000
      properties:
        - property_id: 0
          property_name: id
          property_type:
            primitive_type: DT_SIGNED_INT64
      primary_keys:
        - id
  edge_types:
    - type_name: friend
      type_id: 0
      vertex_type_pair_relations:
        - source_vertex: user
          destination_vertex: user
          relation: MANY_TO_MANY
@@ -0,0 +1,40 @@
graph: modern_graph
loading_config:
  data_source:
    scheme: file # file, oss, s3, hdfs; only file is supported now
  import_option: init # append, overwrite, only init is supported now
  format:
    type: csv
    metadata:
      delimiter: "," # other loading configuration places here
      header_row: false # whether to use the first row as the header
      quoting: false
      quote_char: '"'
      double_quote: true
      escape_char: '\'
      escaping: false
      block_size: 4MB
      batch_reader: true
      null_values: [""]

vertex_mappings:
  - type_name: user # must align with the schema
    inputs:
      - vertices.csv
    column_mappings:
      - column:
          index: 0 # can be omitted if the index is the same as the property index
        property: id
edge_mappings:
  - type_triplet:
      edge: friend
      source_vertex: user
      destination_vertex: user
    inputs:
      - edges.csv
    source_vertex_mappings:
      - column:
          index: 0
    destination_vertex_mappings:
      - column:
          index: 1
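The "must align with the schema" comment in the loading config can be checked mechanically. The following is a minimal sketch, assuming both YAML files have already been parsed into plain dictionaries shaped like the two files above; the function name `find_unknown_types` is hypothetical and is not part of GraphScope.

```python
def find_unknown_types(schema_doc, loading_doc):
    """Return type names used by the loading config that the schema does not define."""
    schema = schema_doc["schema"]
    vertex_names = {v["type_name"] for v in schema.get("vertex_types", [])}
    edge_names = {e["type_name"] for e in schema.get("edge_types", [])}
    unknown = []
    for mapping in loading_doc.get("vertex_mappings", []):
        if mapping["type_name"] not in vertex_names:
            unknown.append(("vertex", mapping["type_name"]))
    for mapping in loading_doc.get("edge_mappings", []):
        triplet = mapping["type_triplet"]
        if triplet["edge"] not in edge_names:
            unknown.append(("edge", triplet["edge"]))
        for key in ("source_vertex", "destination_vertex"):
            if triplet[key] not in vertex_names:
                unknown.append(("vertex", triplet[key]))
    return unknown


# Hand-written dictionaries mirroring the schema and loading config above (illustrative only).
schema_doc = {
    "schema": {
        "vertex_types": [{"type_name": "user"}],
        "edge_types": [{"type_name": "friend"}],
    }
}
loading_doc = {
    "vertex_mappings": [{"type_name": "user"}],
    "edge_mappings": [
        {
            "type_triplet": {
                "edge": "friend",
                "source_vertex": "user",
                "destination_vertex": "user",
            }
        }
    ],
}
print(find_unknown_types(schema_doc, loading_doc))
```

An empty result means every vertex and edge type named in the loading config is declared in the schema. The check covers type names only, not property names or column indices.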
@@ -0,0 +1,39 @@
#!/bin/python3

import os
import sys

if __name__ == "__main__":
    # Expect three arguments: the input file, the vertex output file, and the edge output file
    if len(sys.argv) != 4:
        print("Usage: python3 preprocess.py <file> <vertex_file> <edge_file>")
        sys.exit(1)
    # Get the file paths
    file_path = sys.argv[1]
    vertex_file_path = sys.argv[2]
    edge_file_path = sys.argv[3]
    vertices = set()
    edges = []
    # open the file and iterate over the lines
    with open(file_path, "r") as file:
        for line in file:
            # if the line starts with #, skip it
            if line.startswith("#"):
                continue
            # split the line on whitespace
            parts = line.split()
            # if it contains two parts, it is an edge
            if len(parts) == 2:
                vertices.add(parts[0])
                vertices.add(parts[1])
                edges.append(parts)
    # write vertices to the vertex file and edges to the edge file
    with open(vertex_file_path, "w") as file:
        for vertex in vertices:
            file.write(vertex + "\n")
    with open(edge_file_path, "w") as file:
        for edge in edges:
            file.write(edge[0] + "," + edge[1] + "\n")
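To illustrate what the script above produces, here is a minimal sketch that applies the same rules to a small in-memory sample instead of files. The function name `split_edge_list` and the sample lines are hypothetical. The sketch also sorts the vertex ids numerically so the result is deterministic, which the script itself does not do (it writes the set in arbitrary order).

```python
def split_edge_list(lines):
    """Apply the script's rules: skip '#' lines, keep two-field lines as edges."""
    vertices = set()
    edges = []
    for line in lines:
        if line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) == 2:
            vertices.update(parts)
            edges.append((parts[0], parts[1]))
    # Sorting is only for a deterministic illustration; it assumes integer ids.
    return sorted(vertices, key=int), edges


# Hypothetical input in a whitespace-separated edge-list style.
sample = [
    "# FromNodeId\tToNodeId\n",
    "30\t1412\n",
    "30\t3352\n",
    "3\t30\n",
    "a line with more than two fields\n",
]
vertices, edges = split_edge_list(sample)
vertex_csv = "".join(v + "\n" for v in vertices)
edge_csv = "".join(src + "," + dst + "\n" for src, dst in edges)
print(vertex_csv)
print(edge_csv)
```

Lines that do not split into exactly two fields are dropped silently, as in the script, so a malformed input row produces no error and no edge.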
@@ -0,0 +1,22 @@
name: wiki # then must have a modern dir under ${data} directory
store_type: mutable_csr # v6d, groot, gart
schema:
  vertex_types:
    - type_name: user
      type_id: 0
      x_csr_params:
        max_vertex_num: 5000000
      properties:
        - property_id: 0
          property_name: id
          property_type:
            primitive_type: DT_SIGNED_INT64
      primary_keys:
        - id
  edge_types:
    - type_name: friend
      type_id: 0
      vertex_type_pair_relations:
        - source_vertex: user
          destination_vertex: user
          relation: MANY_TO_MANY
@@ -0,0 +1,40 @@
graph: modern_graph
loading_config:
  data_source:
    scheme: file # file, oss, s3, hdfs; only file is supported now
  import_option: init # append, overwrite, only init is supported now
  format:
    type: csv
    metadata:
      delimiter: "," # other loading configuration places here
      header_row: false # whether to use the first row as the header
      quoting: false
      quote_char: '"'
      double_quote: true
      escape_char: '\'
      escaping: false
      block_size: 4MB
      batch_reader: true
      null_values: [""]

vertex_mappings:
  - type_name: user # must align with the schema
    inputs:
      - vertices.csv
    column_mappings:
      - column:
          index: 0 # can be omitted if the index is the same as the property index
        property: id
edge_mappings:
  - type_triplet:
      edge: friend
      source_vertex: user
      destination_vertex: user
    inputs:
      - edges.csv
    source_vertex_mappings:
      - column:
          index: 0
    destination_vertex_mappings:
      - column:
          index: 1
@@ -0,0 +1,39 @@
#!/bin/python3

import os
import sys

if __name__ == "__main__":
    # Expect three arguments: the input file, the vertex output file, and the edge output file
    if len(sys.argv) != 4:
        print("Usage: python3 preprocess.py <file> <vertex_file> <edge_file>")
        sys.exit(1)
    # Get the file paths
    file_path = sys.argv[1]
    vertex_file_path = sys.argv[2]
    edge_file_path = sys.argv[3]
    vertices = set()
    edges = []
    # open the file and iterate over the lines
    with open(file_path, "r") as file:
        for line in file:
            # if the line starts with #, skip it
            if line.startswith("#"):
                continue
            # split the line on whitespace
            parts = line.split()
            # if it contains two parts, it is an edge
            if len(parts) == 2:
                vertices.add(parts[0])
                vertices.add(parts[1])
                edges.append(parts)
    # write vertices to the vertex file and edges to the edge file
    with open(vertex_file_path, "w") as file:
        for vertex in vertices:
            file.write(vertex + "\n")
    with open(edge_file_path, "w") as file:
        for edge in edges:
            file.write(edge[0] + "," + edge[1] + "\n")
@@ -0,0 +1,22 @@
name: wiki # then must have a modern dir under ${data} directory
store_type: mutable_csr # v6d, groot, gart
schema:
  vertex_types:
    - type_name: article
      type_id: 0
      x_csr_params:
        max_vertex_num: 5000000
      properties:
        - property_id: 0
          property_name: id
          property_type:
            primitive_type: DT_SIGNED_INT64
      primary_keys:
        - id
  edge_types:
    - type_name: link
      type_id: 0
      vertex_type_pair_relations:
        - source_vertex: article
          destination_vertex: article
          relation: MANY_TO_MANY
@@ -0,0 +1,40 @@
graph: modern_graph
loading_config:
  data_source:
    scheme: file # file, oss, s3, hdfs; only file is supported now
  import_option: init # append, overwrite, only init is supported now
  format:
    type: csv
    metadata:
      delimiter: " " # other loading configuration places here
      header_row: false # whether to use the first row as the header
      quoting: false
      quote_char: '"'
      double_quote: true
      escape_char: '\'
      escaping: false
      block_size: 4MB
      batch_reader: true
      null_values: [""]

vertex_mappings:
  - type_name: article # must align with the schema
    inputs:
      - article.csv
    column_mappings:
      - column:
          index: 0 # can be omitted if the index is the same as the property index
        property: id
edge_mappings:
  - type_triplet:
      edge: link
      source_vertex: article
      destination_vertex: article
    inputs:
      - link.csv
    source_vertex_mappings:
      - column:
          index: 0
    destination_vertex_mappings:
      - column:
          index: 1
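This variant sets the delimiter to a single space, so each row of `link.csv` is expected to hold a source id and a destination id separated by exactly one space. The sketch below uses Python's standard `csv` module, not the GraphScope loader, to show how such rows split into column indices 0 and 1, and what happens if a row is tab-separated instead. The sample rows are hypothetical.

```python
import csv
import io

# Hypothetical space-separated rows, as the " " delimiter above expects.
sample = "30 1412\n30 3352\n3 30\n"
reader = csv.reader(io.StringIO(sample), delimiter=" ")
# Column 0 is the source article id, column 1 the destination article id.
links = [(int(row[0]), int(row[1])) for row in reader]
print(links)

# A tab-separated row is not split by a space delimiter: it stays a single field.
tab_rows = list(csv.reader(io.StringIO("30\t1412\n"), delimiter=" "))
print(tab_rows)
```

If the raw edge list is tab-separated, a single-space delimiter would leave each row as one column. Whether the GraphScope loader behaves the same way as Python's `csv` module here is an assumption worth verifying against the actual input file.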