[DistGB] enable dist partition pipeline to save FusedCSCSamplingGraph partition directly #7728

Merged: 45 commits, Sep 19, 2024
8f9e639
change a variable
Aug 14, 2024
91792fb
Merge branch 'dmlc:master' into master
CfromBU Aug 20, 2024
bfeb3b4
modify partition test case
Aug 20, 2024
bec2af3
change pr
Aug 20, 2024
241ccd8
change partition
Aug 20, 2024
21b592d
change test_partition.py and partiton.py
CfromBU Aug 21, 2024
4ef95d5
partition
CfromBU Aug 21, 2024
1074f85
change partition
CfromBU Aug 21, 2024
e03376d
change partition internal function
CfromBU Aug 21, 2024
16097f8
dist partition
CfromBU Aug 21, 2024
098742b
modify convert_partition.py
Sep 5, 2024
cde6a64
renew test_partition
Sep 5, 2024
d0de089
change data_shuffle.py
Sep 5, 2024
f91e8c2
Merge branch 'master' into dist_partition
Rhett-Ying Sep 6, 2024
6729bed
change convert_partition.py
Sep 6, 2024
0cd0d97
change convert_partition.py
Sep 6, 2024
d03a323
change convert_partition.py
Sep 6, 2024
a5e478b
change convert_partition.py
Sep 6, 2024
fdbea5e
change code format
Sep 6, 2024
a693a39
test dist partition
Sep 8, 2024
8def095
convert_partition
Sep 8, 2024
296882f
change format
Sep 8, 2024
71140a7
change utils
Sep 8, 2024
3751b58
Merge branch 'master' into dist_partition
Rhett-Ying Sep 9, 2024
3e811ab
change dispatch_data.py
Sep 10, 2024
3804841
[distGB]change test_dist_partition
Sep 10, 2024
00bb70c
modify dispatch_data.py
Sep 10, 2024
45ddaa6
resolve conflic in partition
Sep 10, 2024
5968f43
resolve conflict in partition
Sep 10, 2024
8ca0f89
change partition format
Sep 10, 2024
412cf7d
partition.py
Sep 10, 2024
eb29e14
change partition
Sep 10, 2024
b4e3afd
change partition format
Sep 10, 2024
2e58bad
change partition
Sep 10, 2024
283eacc
change partition
Sep 10, 2024
86a0c99
change dist partition
Sep 12, 2024
da02eb4
fix format problem
Sep 12, 2024
954c4d7
change partition
Sep 12, 2024
b3c1be5
change docstring in test case
Sep 12, 2024
3834358
change partition
Sep 12, 2024
fe751b1
Merge branch 'master' into dist_partition
Rhett-Ying Sep 12, 2024
b29e5a2
change convert_partition.py
Sep 13, 2024
33c6ea8
change cast_various_to_minimum_dtype_gb
Sep 18, 2024
3d44eb9
change format
Sep 18, 2024
305d7f6
change convert_partition.py
Sep 18, 2024
52 changes: 34 additions & 18 deletions python/dgl/distributed/partition.py
@@ -1600,8 +1600,6 @@ def _save_graph_gb(part_config, part_id, csc_graph):


 def cast_various_to_minimum_dtype_gb(
-    graph,
-    part_meta,
     num_parts,
     indptr,
     indices,
@@ -1610,25 +1608,43 @@ def cast_various_to_minimum_dtype_gb(
     ntypes,
     node_attributes,
     edge_attributes,
+    part_meta=None,
+    graph=None,
+    edge_count=None,
+    node_count=None,
+    tot_edge_count=None,
+    tot_node_count=None,
 ):
     """Cast various data to minimum dtype."""
+    if graph is not None:
+        assert part_meta is not None
+        tot_edge_count = graph.num_edges()
+        tot_node_count = graph.num_nodes()
+        node_count = part_meta["num_nodes"]
+        edge_count = part_meta["num_edges"]
+    else:
+        assert tot_edge_count is not None
+        assert tot_node_count is not None
+        assert edge_count is not None
+        assert node_count is not None
+
     # Cast 1: indptr.
-    indptr = _cast_to_minimum_dtype(graph.num_edges(), indptr)
+    indptr = _cast_to_minimum_dtype(tot_edge_count, indptr)
     # Cast 2: indices.
-    indices = _cast_to_minimum_dtype(graph.num_nodes(), indices)
+    indices = _cast_to_minimum_dtype(tot_node_count, indices)
     # Cast 3: type_per_edge.
     type_per_edge = _cast_to_minimum_dtype(
         len(etypes), type_per_edge, field=ETYPE
     )
     # Cast 4: node/edge_attributes.
     predicates = {
-        NID: part_meta["num_nodes"],
+        NID: node_count,
         "part_id": num_parts,
         NTYPE: len(ntypes),
-        EID: part_meta["num_edges"],
+        EID: edge_count,
         ETYPE: len(etypes),
-        DGL2GB_EID: part_meta["num_edges"],
-        GB_DST_ID: part_meta["num_nodes"],
+        DGL2GB_EID: edge_count,
+        GB_DST_ID: node_count,
     }
     for attributes in [node_attributes, edge_attributes]:
         for key in attributes:
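
Note on the hunk above: the function now accepts its size information in two ways, either derived from `graph` plus `part_meta`, or passed explicitly as four counts. The following is a minimal standalone sketch of that resolution step only, not the code in `partition.py`. The name `resolve_counts` and the stand-in graph object are hypothetical, and it raises `ValueError` where the diff uses `assert`, which is an assumption about what a caller might prefer.

```python
from types import SimpleNamespace


def resolve_counts(
    part_meta=None,
    graph=None,
    edge_count=None,
    node_count=None,
    tot_edge_count=None,
    tot_node_count=None,
):
    """Return (tot_edge_count, tot_node_count, edge_count, node_count).

    Hypothetical helper mirroring the branch added in the diff.
    """
    if graph is not None:
        # Graph mode: totals come from the graph, per-partition counts
        # from the partition metadata. Explicit counts are ignored.
        if part_meta is None:
            raise ValueError("part_meta is required when graph is given")
        return (
            graph.num_edges(),
            graph.num_nodes(),
            part_meta["num_edges"],
            part_meta["num_nodes"],
        )
    # Count mode: every count must be supplied by the caller.
    explicit = {
        "tot_edge_count": tot_edge_count,
        "tot_node_count": tot_node_count,
        "edge_count": edge_count,
        "node_count": node_count,
    }
    missing = [name for name, value in explicit.items() if value is None]
    if missing:
        raise ValueError("missing counts: " + ", ".join(missing))
    return tot_edge_count, tot_node_count, edge_count, node_count


# Stand-in for a graph; only the two methods used above are provided.
fake_graph = SimpleNamespace(num_edges=lambda: 10, num_nodes=lambda: 4)

from_graph = resolve_counts(
    part_meta={"num_edges": 6, "num_nodes": 3}, graph=fake_graph
)
from_counts = resolve_counts(
    edge_count=6, node_count=3, tot_edge_count=10, tot_node_count=4
)
```

Both calls resolve to the same four values, which is the property that lets a caller without a graph object reuse the same casting logic.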
@@ -1779,16 +1795,16 @@ def gb_convert_single_dgl_partition(
     )

     indptr, indices, type_per_edge = cast_various_to_minimum_dtype_gb(
-        graph,
-        part_meta,
-        num_parts,
-        indptr,
-        indices,
-        type_per_edge,
-        etypes,
-        ntypes,
-        node_attributes,
-        edge_attributes,
+        graph=graph,
+        part_meta=part_meta,
+        num_parts=num_parts,
+        indptr=indptr,
+        indices=indices,
+        type_per_edge=type_per_edge,
+        etypes=etypes,
+        ntypes=ntypes,
+        node_attributes=node_attributes,
+        edge_attributes=edge_attributes,
     )

     csc_graph = gb.fused_csc_sampling_graph(
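
Note on the casts in this diff: each `_cast_to_minimum_dtype` call passes an upper bound (a total edge count, a node count, a number of types) alongside the data to be cast. The body of that helper is not part of this diff, so the following is only a generic illustration of how a bound can select the narrowest integer type. `smallest_int_dtype` and its candidate list are hypothetical and may differ from what DGL actually checks.

```python
# Hypothetical illustration; not DGL's _cast_to_minimum_dtype.
# Candidate signed integer types with their maximum representable value.
_SIGNED_INT_LIMITS = [
    ("int8", 2**7 - 1),
    ("int16", 2**15 - 1),
    ("int32", 2**31 - 1),
    ("int64", 2**63 - 1),
]


def smallest_int_dtype(upper_bound):
    """Name the narrowest signed int type that can hold [0, upper_bound]."""
    if upper_bound < 0:
        raise ValueError("upper_bound must be non-negative")
    for name, limit in _SIGNED_INT_LIMITS:
        if upper_bound <= limit:
            return name
    raise OverflowError("upper_bound does not fit in int64")


# An indptr array holds edge offsets, so its bound is the total edge count.
indptr_dtype = smallest_int_dtype(3_000_000_000)
# A type-per-edge array only needs to distinguish the edge types.
etype_dtype = smallest_int_dtype(5)
```

The bound is treated as inclusive here because an offsets array such as `indptr` ends with the total edge count itself.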