EVPN routes not added when the configuration file is reloaded through the reloader script #17430

Open
2 tasks done
fedepaol opened this issue Nov 14, 2024 · 18 comments
Labels
triage Needs further investigation

Comments

@fedepaol

Description

When FRR starts with a fully fledged frr.conf, type-5 routes are available when running show bgp l2vpn evpn.
When starting with an empty configuration file and reloading the configuration file using the reloader script (sorry 😅 ), the routes are not added.

Version

Reproduced both with `FRRouting 10.1.1_git (leaf2) on Linux(5.14.0-427.35.1.el9_4.x86_64).` and with master: `FRRouting 10.3-dev_git20241114 (leaf2) on Linux(5.14.0-427.35.1.el9_4.x86_64).`

How to reproduce

Start frr with an empty configuration, such as:

frr version 10.3-dev_git20241114
frr defaults traditional
hostname leaf2
log file /etc/frr/frr.log
!
debug zebra events
debug zebra kernel
debug zebra rib
debug zebra nht
debug zebra vxlan
debug zebra nexthop
debug bgp keepalives
debug bgp neighbor-events
debug bgp nht
debug bgp updates in
debug bgp updates out
debug bgp zebra
debug bfd peer
debug bfd zebra
debug bfd network

Apply the host configuration to make evpn work:

ip addr add 100.65.0.2/32 dev lo


ip link add red type vrf table 1100
ip link set red up

ip link set eth2 master red
ip addr add 192.168.11.2/24 dev eth2

ip link add br100 type bridge
ip link set br100 master red addrgenmode none
ip link set br100 addr aa:bb:cc:00:00:64
ip link add vni100 type vxlan local 100.65.0.2 dstport 4789 id 100 nolearning
ip link set vni100 master br100 addrgenmode none
ip link set vni100 type bridge_slave neigh_suppress on learning off
ip link set vni100 up
ip link set br100 up

Use the reloader script to load a file like:

vrf red
 vni 100
exit-vrf
!
router bgp 64512
 no bgp ebgp-requires-policy
 no bgp network import-check
 no bgp default ipv4-unicast

 neighbor 192.168.1.2 remote-as 64612
 neighbor 192.168.1.2 allowas-in origin
 !
 address-family ipv4 unicast
  neighbor 192.168.1.2 activate
  network 100.65.0.2/32
 exit-address-family
 !
 address-family l2vpn evpn
  neighbor 192.168.1.2 activate
  neighbor 192.168.1.2 allowas-in origin
  advertise-all-vni
  advertise-svi-ip
 exit-address-family
exit
!
router bgp 64512 vrf red
 !
 address-family ipv4 unicast
  redistribute connected
 exit-address-family
 !
 address-family ipv6 unicast
  redistribute static
 exit-address-family
 !
 address-family l2vpn evpn
  advertise ipv4 unicast
  advertise ipv6 unicast
 exit-address-family
exit
!

Run show bgp l2vpn evpn

Expected behavior

Any connected or BGP-learned route should be advertised as a type-5 route and appear in the EVPN table, as happens when FRR starts with the full file:

BGP table version is 2, local router ID is 100.65.0.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-1 prefix: [1]:[EthTag]:[ESI]:[IPlen]:[VTEP-IP]:[Frag-id]
EVPN type-2 prefix: [2]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
EVPN type-4 prefix: [4]:[ESI]:[IPlen]:[OrigIP]
EVPN type-5 prefix: [5]:[EthTag]:[IPlen]:[IP]

   Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 192.168.11.2:2
 *>  [5]:[0]:[24]:[192.168.11.0]
                    100.65.0.2               0         32768 ?
                    ET:8 RT:64512:100 Rmac:aa:bb:cc:00:00:64

Actual behavior

show bgp l2vpn evpn
No prefixes displayed, 0 exist
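
To compare the expected and actual behavior without reading the table by eye, the text output of show bgp l2vpn evpn can be scanned for type-5 entries. The following is a minimal sketch, not part of FRR: the function name `type5_prefixes` and the regular expression are assumptions based only on the two outputs pasted in this report.

```python
import re

# Matches EVPN type-5 route lines such as " *>  [5]:[0]:[24]:[192.168.11.0]".
# The legend line "EVPN type-5 prefix: [5]:[EthTag]:[IPlen]:[IP]" is skipped
# because its bracketed fields are not numeric.
TYPE5_RE = re.compile(r"\[5\]:\[(\d+)\]:\[(\d+)\]:\[([0-9A-Fa-f:.]+)\]")


def type5_prefixes(output: str) -> list[str]:
    """Return the type-5 prefixes (as 'ip/len') found in the given output."""
    prefixes = []
    for line in output.splitlines():
        m = TYPE5_RE.search(line)
        if m:
            _eth_tag, ip_len, ip = m.groups()
            prefixes.append(f"{ip}/{ip_len}")
    return prefixes


# Excerpts copied from the "Expected behavior" and "Actual behavior" outputs.
EXPECTED = """\
EVPN type-5 prefix: [5]:[EthTag]:[IPlen]:[IP]
Route Distinguisher: 192.168.11.2:2
 *>  [5]:[0]:[24]:[192.168.11.0]
                    100.65.0.2               0         32768 ?
"""
ACTUAL = "No prefixes displayed, 0 exist\n"

print(type5_prefixes(EXPECTED))
print(type5_prefixes(ACTUAL))
```

With the excerpts above, the first call finds the 192.168.11.0/24 route and the second finds nothing, which is the difference this issue describes.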

Additional context

I don't believe it's relevant, but I am running this inside a container.
When not using master, I can reproduce this only when using the same ASN for the VRF router, because otherwise I hit #16152.

Checklist

  • I have searched the open issues for this bug.
  • I have not included sensitive information in this report.
@fedepaol fedepaol added the triage Needs further investigation label Nov 14, 2024
@ton31337
Member

Grh, frr-reload-foobar :) Could you show frr-reload.py --debug ... output?

@fedepaol
Author

It doesn't say anything relevant:

leaf2:/# python /usr/lib/frr/frr-reload.py --debug --reload /tmp/frr.conf.differentasn
[114|mgmtd] sending configuration
[115|zebra] sending configuration
[118|ospfd] sending configuration
[120|ldpd] sending configuration
[121|bgpd] sending configuration
[115|zebra] done
[118|ospfd] done
[129|watchfrr] sending configuration
[120|ldpd] done
[131|staticd] sending configuration
[114|mgmtd] done
[132|bfdd] sending configuration
Waiting for children to finish applying config...
[129|watchfrr] done
[131|staticd] done
[132|bfdd] done
[121|bgpd] done
[139|mgmtd] sending configuration
[140|zebra] sending configuration
[143|ospfd] sending configuration
[145|ldpd] sending configuration
[146|bgpd] sending configuration
[140|zebra] done
MGMTD: No changes found to be committed!
[139|mgmtd] done
[143|ospfd] done
[145|ldpd] done
[154|watchfrr] sending configuration
[156|staticd] sending configuration
[157|bfdd] sending configuration
Waiting for children to finish applying config...
[154|watchfrr] done
[146|bgpd] done
[156|staticd] done
[157|bfdd] done
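
One way to confirm that the debug output really contains nothing of interest is to check that every daemon that started receiving configuration also reported done. This is a rough sketch only: the line format `[<pid>|<daemon>] <event>` is inferred from the output pasted above rather than from any documented format, and the helper name `unfinished` is hypothetical.

```python
import re

# Line format inferred from the pasted output: "[<pid>|<daemon>] <event>".
LINE_RE = re.compile(r"^\[(\d+)\|(\w+)\] (sending configuration|done)$")


def unfinished(log: str) -> set:
    """Return (pid, daemon) pairs that started sending config but never printed 'done'."""
    pending = set()
    for line in log.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # e.g. "Waiting for children to finish applying config..."
        pid, daemon, event = m.groups()
        if event == "sending configuration":
            pending.add((pid, daemon))
        else:
            pending.discard((pid, daemon))
    return pending


# A few lines copied from the output above.
SAMPLE = """\
[121|bgpd] sending configuration
[115|zebra] sending configuration
[115|zebra] done
Waiting for children to finish applying config...
[121|bgpd] done
"""

print(unfinished(SAMPLE))                                # every daemon finished
print(unfinished("[121|bgpd] sending configuration\n"))  # bgpd never reported done
```

Applied to the full output above, each sending configuration line has a matching done line, so the script itself reports no failure.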

Also, the output of show running-conf seems consistent with what I am passing:


vrf red
 vni 100
exit-vrf
!
router bgp 64512
 no bgp ebgp-requires-policy
 no bgp default ipv4-unicast
 no bgp network import-check
 neighbor 192.168.1.2 remote-as 64612
 !
 address-family ipv4 unicast
  network 100.65.0.2/32
  neighbor 192.168.1.2 activate
  neighbor 192.168.1.2 allowas-in origin
 exit-address-family
 !
 address-family l2vpn evpn
  neighbor 192.168.1.2 activate
  neighbor 192.168.1.2 allowas-in origin
  advertise-all-vni
  advertise-svi-ip
 exit-address-family
exit
!
router bgp 64512 vrf red
 !
 address-family ipv4 unicast
  redistribute connected
 exit-address-family
 !
 address-family ipv6 unicast
  redistribute static
 exit-address-family
 !
 address-family l2vpn evpn
  advertise ipv4 unicast
  advertise ipv6 unicast
 exit-address-family
exit
!
end

@fedepaol
Author

@ton31337 also note that this is not sporadic but consistent behaviour, so it can be reproduced locally

@fedepaol
Author

(but of course if you need me to add parameters just ask, I put together a quick repro before reporting this)

@ton31337
Member

@fedepaol is this enough to reproduce with a single instance (no established sessions)?

@fedepaol
Author

fedepaol commented Nov 16, 2024

> @fedepaol is this enough to reproduce with a single instance (no established sessions)?

Yes, my very basic reproducer is a containerlab (clab) instance with a non-FRR container connected to an FRR container (which is there to provide the connected routes). If needed, I can share it.

@ton31337
Member

Looks like I'm able to reproduce it (I think so, still doing more tests)...

@ton31337
Member

UPDATE: seems not, there was a typo in the configuration. Please send the logs of all these debug statements you have enabled. And also, could you show "show ip bgp", "show ip route", "show ip bgp vrf red", "show ip route vrf red"?

@fedepaol
Author

It's happening with master too. I'll provide everything, and will possibly push the reproducer too.

@fedepaol
Author

leaf2# show ip bgp
BGP table version is 1, local router ID is 100.65.0.2, vrf id 0
Default local pref 100, local AS 64512
Status codes:  s suppressed, d damped, h history, u unsorted, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

     Network          Next Hop            Metric LocPrf Weight Path
 *>  100.65.0.2/32    0.0.0.0                  0         32768 i

Displayed 1 routes and 1 total paths
leaf2# show ip route
Codes: K - kernel route, C - connected, L - local, S - static,
       R - RIP, O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric, t - Table-Direct,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

K>* 0.0.0.0/0 [0/0] via 172.20.20.1, eth0, weight 1, 00:35:20
L * 100.65.0.2/32 is directly connected, lo, weight 1, 00:35:20
C>* 100.65.0.2/32 is directly connected, lo, weight 1, 00:35:20
C>* 172.20.20.0/24 is directly connected, eth0, weight 1, 00:35:20
L>* 172.20.20.2/32 is directly connected, eth0, weight 1, 00:35:20
leaf2# show ip bgp vrf red
BGP table version is 1, local router ID is 192.168.11.2, vrf id 2
Default local pref 100, local AS 64512
Status codes:  s suppressed, d damped, h history, u unsorted, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

     Network          Next Hop            Metric LocPrf Weight Path
 *>  192.168.11.0/24  0.0.0.0                  0         32768 ?

Displayed 1 routes and 1 total paths
leaf2# show ip route vrf red
Codes: K - kernel route, C - connected, L - local, S - static,
       R - RIP, O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric, t - Table-Direct,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

VRF red:
C>* 192.168.11.0/24 is directly connected, eth2, weight 1, 00:36:06
L>* 192.168.11.2/32 is directly connected, eth2, weight 1, 00:36:06

@fedepaol
Author

frr.log.gz

@ton31337
Member

Looking at the logs, at what time did you run frr-reload.py?

@fedepaol
Author

I don't remember 😅 BUT here is another one I just did. Reload happens at 09:53
frr.log.gz

@chiragshah6
Member

Can you check the output of `show bgp l2vpn evpn vni 100` and `show bgp vrf red ipv4 unicast 192.168.11.0/24`?

@chiragshah6
Member

I tried a similar test that originates a local static route from a tenant VRF into EVPN as a type-5 route, ran frr-reload on the saved configuration, and I do see the locally originated type-5 route present.

I used tests/topotests/bgp_evpn_rt5 to capture the output below.

r1(config)# vrf r1-vrf-101
r1(config-vrf)# ip route 8.4.2.0/24 blackhole

r1(config)# router bgp 65000
r1(config-router)# router bgp 65000 vrf r1-vrf-101
r1(config-router)#
r1(config-router)# address-family ipv4 unicast
r1(config-router-af)# no network 192.168.102.21/32
r1(config-router-af)# redistribute static

r1# show bgp l2vpn evpn route 
BGP table version is 7, local router ID is 192.168.100.21
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-1 prefix: [1]:[EthTag]:[ESI]:[IPlen]:[VTEP-IP]:[Frag-id]
EVPN type-2 prefix: [2]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
EVPN type-4 prefix: [4]:[ESI]:[IPlen]:[OrigIP]
EVPN type-5 prefix: [5]:[EthTag]:[IPlen]:[IP]

   Network          Next Hop            Metric LocPrf Weight Path
                    Extended Community
Route Distinguisher: 192.168.102.21:2
 *>  [5]:[0]:[24]:[8.4.2.0]
                    192.168.100.21           0         32768 ?
                    ET:8 RT:65000:101 Rmac:96:c6:c5:28:0f:2d

Displayed 1 prefixes (1 paths)
r1# 


r1# wr
Note: this version of vtysh never writes vtysh.conf

Warning: attempting direct configuration write without watchfrr.
File permissions and ownership may be incorrect, or write may fail.

Building Configuration...
Integrated configuration saved to /etc/frr/frr.conf
[OK]


root@r1:/tmp/topotests/bgp_evpn_rt5.test_bgp_evpn/r1# /usr/lib/frr/frr-reload.py --reload --debug /etc/frr/frr.conf
[3551|mgmtd] sending configuration
[3552|zebra] sending configuration
Waiting for children to finish applying config...
[3558|bgpd] sending configuration
[3558|bgpd] done
[3551|mgmtd] done
[3568|staticd] sending configuration
[3552|zebra] done
[3568|staticd] done
[3576|mgmtd] sending configuration
[3577|zebra] sending configuration
Waiting for children to finish applying config...
[3593|staticd] sending configuration
[3583|bgpd] sending configuration
MGMTD: No changes found to be committed!
[3576|mgmtd] done
[3593|staticd] done
[3577|zebra] done
[3583|bgpd] done



root@r1:/tmp/topotests/bgp_evpn_rt5.test_bgp_evpn/r1# vtysh -c "show bgp l2vpn evpn route"
BGP table version is 7, local router ID is 192.168.100.21
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-1 prefix: [1]:[EthTag]:[ESI]:[IPlen]:[VTEP-IP]:[Frag-id]
EVPN type-2 prefix: [2]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
EVPN type-4 prefix: [4]:[ESI]:[IPlen]:[OrigIP]
EVPN type-5 prefix: [5]:[EthTag]:[IPlen]:[IP]

   Network          Next Hop            Metric LocPrf Weight Path
                    Extended Community
Route Distinguisher: 192.168.102.21:2
 *>  [5]:[0]:[24]:[8.4.2.0]
                    192.168.100.21           0         32768 ?
                    ET:8 RT:65000:101 Rmac:96:c6:c5:28:0f:2d

@fedepaol
Author

leaf2# show bgp l2vpn evpn vni 100
VNI: 100 (known to the kernel)
  Type: L3
  Tenant VRF: red
  RD: 192.168.11.2:2
  Originator IP: 100.65.0.2
  MAC-VRF Site-of-Origin:
  Advertise-gw-macip : n/a
  Advertise-svi-macip : n/a
  Advertise-pip: Yes
  System-IP: 100.65.0.2
  System-MAC: aa:bb:cc:00:00:64
  Router-MAC: aa:bb:cc:00:00:64
  Import Route Target:
    64512:100
  Export Route Target:
    64512:100
leaf2# show bgp vrf red ipv4 unicast 192.168.11.0/24
BGP routing table entry for 192.168.11.0/24, version 1
Paths: (1 available, best #1, vrf red)
  Not advertised to any peer
  Local
    0.0.0.0 from 0.0.0.0 (192.168.11.2)
      Origin incomplete, metric 0, weight 32768, valid, sourced, best (First path received)
      Last update: Fri Nov 22 08:28:34 2024
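
For anyone comparing this VNI state between a working and a non-working run, the show bgp l2vpn evpn vni 100 output can be turned into a dictionary and diffed. The sketch below is an illustration under assumptions: `parse_vni` is a hypothetical helper, and the "Key: value" layout with indented list items is inferred only from the output pasted above.

```python
import re

# "Key: value" or "Key:" (values on the following lines). The colon must be
# followed by whitespace or end of line, so list items like "64512:100" do not match.
KV_RE = re.compile(r"^([^:]+?)\s*:(?:\s+(.*))?$")


def parse_vni(output: str) -> dict:
    """Parse the VNI detail output into a dict of strings and lists."""
    info = {}
    current = None  # list being filled for a "Key:" line with no inline value
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = KV_RE.match(line)
        if m:
            key, value = m.group(1), m.group(2)
            if value is None:
                current = info.setdefault(key, [])
            else:
                info[key] = value
                current = None
        elif current is not None:
            current.append(line)
    return info


# Excerpt copied from the output above.
SAMPLE = """\
VNI: 100 (known to the kernel)
  Type: L3
  Tenant VRF: red
  RD: 192.168.11.2:2
  Import Route Target:
    64512:100
  Export Route Target:
    64512:100
"""

info = parse_vni(SAMPLE)
print(info["Type"], info["Tenant VRF"], info["Export Route Target"])
```

In the output above, the VNI is reported as type L3, bound to VRF red, with import and export route target 64512:100, while the VRF table holds 192.168.11.0/24 as a sourced, best path.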

@fedepaol
Author

If it helps, I put together a condensed version of the reproducer here: https://github.com/fedepaol/frrrepro. Note that it is containerlab-based.

Steps are:

  • run setup.sh
  • run ./reload.sh frr.conf.differentasn
  • docker exec -it clab-frrrepro-leaf2 vtysh to wander around

hope it helps
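
The steps above could be scripted roughly as follows. This is a sketch under assumptions: it expects to be run from a checkout of the reproducer repository, it invokes setup.sh as `./setup.sh`, the `--run` flag belongs to this sketch only, and the interactive vtysh session is replaced by a single non-interactive `vtysh -c` query.

```python
import shlex
import subprocess
import sys

CONTAINER = "clab-frrrepro-leaf2"  # container name taken from the steps above


def repro_commands(conf: str = "frr.conf.differentasn") -> list[list[str]]:
    """Return the reproduction steps as argv lists, in order."""
    return [
        ["./setup.sh"],          # assumed to be executable in the current directory
        ["./reload.sh", conf],
        # Non-interactive replacement for "docker exec -it ... vtysh".
        ["docker", "exec", CONTAINER, "vtysh", "-c", "show bgp l2vpn evpn"],
    ]


if __name__ == "__main__":
    # Without --run (a flag of this sketch only) the commands are just printed.
    run = "--run" in sys.argv[1:]
    for cmd in repro_commands():
        print(shlex.join(cmd))
        if run:
            subprocess.run(cmd, check=True)
```

Printing the commands by default keeps the sketch safe to try on a machine without containerlab or Docker installed.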

@ton31337
Member

@chiragshah6 what is interesting is that with this config (a topotest, which you can also grab) https://github.com/FRRouting/frr/compare/master...opensourcerouting:frr:fix/issue_17430?expand=1, I don't see the type-5 route either 🤷
