-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Give Hugr edges unique IDs #1535
Comments
We don't really have binary edges at the momentThe term edge might not be entirely ideal since the connections between ports do not quite map onto (binary) edges in the classical sense. Rather, there exists a partition of the set of ports into disjoint subsets. It might be more instructive to draw your copy example in this way: The difference of this to binary edges that connect ports becomes clearer in this situation: I'd be very reluctant to change that model. It might not be immediately natural when thought of as a graph, but I think it is a great fit for an IR. The hyperedges correspond to variables. An output port produces a value for that variable while an input port consumes a value for that variable. It also has a very natural description as a database so that pattern matching can be done via conjunctive queries. It's basically a directed version of a hypergraph category. Embeddings vs SubgraphsLet's denote your two patterns as There's a separate point that the map What do you want to do?Perhaps it would help to use pairs of connected ports as your "edge id"? Could you write down a rewrite that would be hard to express? I'm struggling to help you without knowing the problem you're ultimately trying to solve. |
I absolutely understand your points and agree with them. I am simply trying to figure out how these considerations and properties of the hugr IR should be turned into a practical interface for rewriting. I'm by now convinced that our current code in Hugr is wrong. Assigning edge IDs was my simplest, and somewhat naive, solution.
That is a matter of definition: the ports
I want to know what is the minimum amount of data that defines a rewrite. DPO rewriting in the context of Set (and thus Graph) in essence tells us that a rewrite can always be expressed by two sets You are right that the combination of the pattern ProposalIs it safe to make the following two assumptions? If so it sounds like I can define rewrites by three sets of ports (along with data on how they are to be rewired):
The latter is what allows me to match a pattern with a binary copy on a ternary copy at the boundary. I don't know that this holds in other cases (CFG?) -- in which case defining the conditions of what is a valid set of boundary ports becomes dependent on node types (or maybe just edge direction?). How would this be specified? |
Indeed. I was working with the following definition, which most closely matches what the implementation does:
so that each With respect to the associativity assumption: yes. For dataflow graphs, hyperedges are essentially the spiders that arise from a cocommutative copy comonoid. For control flow graphs, they are the spiders for the commutative monoid on branches, where multiplication is control flow merges and the unit is a branch that is never taken. Overall, it'd probably be safe to work with it as if each hyperedge was the spider for a special commutative Frobenius algebra. Valid rewrites of data flow should preserve the comonoid fragment, and valid rewrites of control flow the monoid fragment. (This would be significantly easier if we used monogamous hypergraphs with explicit copies and control flow merges, which is one of the reasons why I'd prefer that model. But so far I've not managed to convince the others of this.) BoundariesA boundary of a pattern should not consist of ports, but of hyperedges, I think. Why the subsets?It appears important to you that matches are expressed uniquely by subsets, instead of embeddings or maps in general. Why? Any pattern matching algorithm that I can imagine, applied to some family of patterns (I haven't checked but the structure described at the top of this comment does not look like it'd form an adhesive category. So I'd be careful in trying to use too much intution from DPO for graphs here. I suspect it might be quasiadhesive.) |
Indeed - normally we have an implicit/invisible "copy" node, for dataflow edges only, with a single inport but arbitrary outports; and an implicit/invisible "merge" node for controlflow edges only, with a single outport but arbitrary inports. We could have an implicit "something" node with arbitrary inports and outports, even if it doesn't make sense for any current edge type.
Love this! :) |
These are equivalent (any port is part of exactly one hyperedge), so this is not for correctness, but conceptual neatness? (In which case, yes that works for me!) |
@lmondada is there actually a Hugr API that takes "an edge" (e.g. there is no |
My problem all along can be reformulated by saying that if edges can have more than two ends, then we need to specify which ends of hyper edges are part of a match. It is strictly more general than matching hyperedges (corresponds to matching all ports attached to that edge), but can also specify matches that only apply to a subset of the ports of a hyperedge: think of an output that is copied three times, matching an input boundary that takes two copies. This might be equivalent to matching hyperedges but expanding them into several edges using the commutative Frobenius algebra whenever necessary. |
I have meditated a bit more on this, and it now appears to me that you can make our multiportgraphs form an adhesive category. Consider the category
Then (a candidate for) the category of multiportgraphs is the category of functors |
Nice example! :). I think here I'd have said that the boundary was the unique outport connected to both inports; I guess that'd mean the (hyper)edge was in the pattern, but the source node was not. (Really I've always thought of the pattern containing only nodes, not edges, but that only works with explicit copies - which I'm not against myself.) There are more how-to-define questions here - does a pattern contain its boundary, or is the boundary expressed in terms of a different type of thing (e.g. edge or port) to the things in the pattern (e.g. just nodes, or nodes+edges but not ports).
What if we wanted to argue the rewrite was possible - keeping 'f' but removing 'a' - as this would potentially enable far more rewriting downstream (nodes after the 'a' are now connected directly to those before the 'f') ? (Also, I should stress that we should be thinking in terms of a second pass to select which applicable rewrites to perform, rather than greedily applying anything that matches.) |
Sorry for the late reply. Having taken some time to think about this: I am fine with any solution that works, and @zrho I agree that your formalism is sound. I need to restate my request in broader terms: please give me an API that I can use to define & perform arbitrary rewrites on HUGRs. And in the process please tell me what data I exactly need to keep track of whilst pattern matching.
API design. It seems cumbersome to me to have to provide the subgraph along with the pattern and the embedding for every graph transformation one wants to perform. The naive way would also require the user to create a pattern object in-memory every time, which we should avoid. If you see a neat API for it, then that approach is fine to me :) Hope this clarifies my points. |
I believe defining hugr subgraphs in generality requires a way to uniquely identify Hugr edges. In
SimpleReplacement
we solved the lack of edge IDs by using(Node, IncomingPort)
s as identifiers, which makes the assumption that every incoming port is connected to at most one outgoing port. This is not true for edges in CFG graphs (thanks @zrho for pointing that out), and so the implementation is actually incorrect.Why not use hyper-edges?
@zrho made a good point that for the purposes of pattern matching, all that is required is a way to evaluate predicates of the type "node n1 at port p1 is linked to node n2 at port p2". This does not require unique edge IDs. Instead, could one identify an edge simply by the smallest port it is attached to?
This in effect gives IDs (not to edges but) to hyper-edges: assuming p1 < p2, p3, the two edges below would be assigned the same ID:
This has advantages for pattern matching: in the case of a float wire, to check that both inputs to
n2
andn3
are identical could be done directly, without having to matchn1
. However, by symmetry this would mean that it would also be desirable for the two edges in the following diagram to be part of the same hyper-edge:Should this be transitive? I don't think so... Thinking of the case in which all wires are floats, we'd obtain that
n1 = n4
.More importantly: hyper-edge IDs are not sufficient to define sub-hugrs
It would be desirable that given a Hugr embedding$\phi : P \to H$ obtained from pattern matching, the image of the embedding $\phi(P) \subseteq H$ determines the subgraph of $H$ uniquely. This is not the case if the embedding is defined by its action on nodes and hyper edge IDs. To see why, consider the following two patterns
EDIT: I've now added ports to make the patterns more explicit, but the mermaid rendering is now somewhat confusing. If you look long enough you'll see this is just how tket2 would represent a sequence of rotations (by the same angle) on 1 qubit.
Now consider matching the pattern below against the graph above:
Clearly, pattern 2 also matches in the graph of pattern 1. However, both embeddings (of pattern 1 into pattern 1 and of pattern 2 into pattern 1) are identical, when viewed as embeddings on nodes and hyper-edges. Thus one cannot define a hugr subgraph by maintaining a set of nodes and hyperedges. Additional information is required (such as the exact graph that was matched). Edge IDs would solve this.
The text was updated successfully, but these errors were encountered: