Hi @rusty1s, I am trying to perform node classification, and I apply the following transform to the Cora dataset:

    transform2 = RandomNodeSplit(split="test_rest", num_splits=10)
    data = transform2(data)
    print(data)

If I then run:

    train_nodes = data.train_mask.nonzero(as_tuple=True)[0].cpu().numpy()
    test_nodes = data.test_mask.nonzero(as_tuple=True)[0].cpu().numpy()
    leakage_nodes = np.intersect1d(train_nodes, test_nodes)
    if len(leakage_nodes) > 0:
        print(f"Warning: Found {len(leakage_nodes)} nodes in both the training and test sets.")
    else:
        print("No leakage detected.")

I get the warning that some nodes are in both the training and test sets.

Am I using the transform incorrectly?
Answered by
rusty1s
May 27, 2024
Don't you need to check each split in isolation?

    import numpy as np
    from torch_geometric.datasets import Planetoid
    from torch_geometric.transforms import RandomNodeSplit

    dataset = Planetoid('/tmp/Cora', name='Cora')
    data = dataset[0]

    transform2 = RandomNodeSplit(split="test_rest", num_splits=10)
    data = transform2(data)

    # With num_splits=10 each mask has one column per split, so check column by column.
    for i in range(10):
        train_nodes = data.train_mask[:, i].nonzero(as_tuple=True)[0].cpu().numpy()
        test_nodes = data.test_mask[:, i].nonzero(as_tuple=True)[0].cpu().numpy()
        leakage_nodes = np.intersect1d(train_nodes, test_nodes)
        if len(leakage_nodes) > 0:
            print(
                f"Warning: Found {len(leakage_nodes)} nodes in both the training and test sets."
            )
        else:
            print("No leakage detected.")
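To illustrate why the check across all splits at once reports leakage while the per-split check does not, here is a minimal sketch. It uses small NumPy arrays as stand-ins for the mask tensors, on the assumption that each mask has one row per node and one column per split, as the `[:, i]` indexing above implies. The toy values, the omission of validation nodes, and the variable names are all made up for illustration.

```python
import numpy as np

# Toy stand-in for [num_nodes, num_splits] masks (4 nodes, 2 splits).
# Values are invented; validation nodes are left out for brevity.
train_mask = np.array([
    [True,  False],
    [True,  False],
    [False, True],
    [False, True],
])
test_mask = ~train_mask  # toy simplification: every non-train node is a test node

# Check over all splits at once: row indices are gathered across every column,
# so a node that trains in one split and tests in another lands in both sets.
train_nodes_all = np.unique(np.nonzero(train_mask)[0])
test_nodes_all = np.unique(np.nonzero(test_mask)[0])
false_alarm = np.intersect1d(train_nodes_all, test_nodes_all)

# Per-split check: the element-wise AND compares the masks column by column.
overlap_per_split = (train_mask & test_mask).sum(axis=0)

print("nodes flagged by the all-splits check:", false_alarm)
print("overlapping nodes per split:", overlap_per_split)
```

In this toy case every node is flagged by the first check, yet no single split has any overlap. On the actual boolean mask tensors, something like `(data.train_mask & data.test_mask).sum(dim=0)` should give the same per-split counts without a Python loop, though that is untested here.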
Answer selected by
AdarshMJ