Juniper "Unaligned memory access" on vlan change #8196

Open
hrodenburg opened this issue Jun 30, 2024 · 5 comments
@hrodenburg

Describe the bug

In short:

When PacketFence issues a VLAN change on a port (after registering/deregistering a device), the switch logs an error and the VLAN on the switch port is not changed. After manually unplugging and replugging the port, the correct VLAN is applied.

The details:

The error being logged by the switch:

/kernel: Unaligned memory access by pid 1297 [authd] at bff7cafa PC[460f4c]

I found an old post that describes exactly the same problem: https://packetfence-users.narkive.com/jZiUjeMf/issue-with-vlan-changes-on-juniper-ex-switch

Switch hardware:

  • Juniper EX2200-C-12P-2G
  • Junos version: 12.3R12-S15
  • The switch is configured in packetfence as type "Junos v12.x".

Software:

  • Packetfence version: 13.2.0+20240625122827+1351629202+0011+maintenance~13~2+bullseye1 (latest as of now).
  • OS: Debian 11

Additional context
I'm in the process of upgrading my Packetfence installation. On the old server, which is running version 10.3, this is working without any issues.

I compared the switch templates between the two PacketFence versions. While the structure has changed a bit, the only difference I could find is:

$ diff Junos.pm.bak Junos.pm
142c142,146
<         my $connection_info = $self->radius_deauth_connection_info($send_disconnect_to);
---
>         my $connection_info = {
>             nas_ip => $send_disconnect_to,
>             secret => $self->{'_radiusSecret'},
>             LocalAddr => $self->deauth_source_ip($send_disconnect_to),
>         };

I patched Junos.pm so it would be exactly the same as in Packetfence 10.3, but that made no difference. So it seems the problem is located somewhere else (radius?).

Please let me know if any information is missing; any help will be much appreciated.
Thanks!

@fdurand
Member

fdurand commented Jul 3, 2024

To see what is happening, can you compare the deauth request sent by the old version with the one sent by the new version?

radsniff -i any -f "port 3799" -X

@hrodenburg
Author

Hi,

Thanks for your response. I hadn't thought of capturing the RADIUS response (and I was not aware of the radsniff utility).
As suggested I captured both responses:

On the new version (13.2):

hugo@pf1:~$ sudo radsniff -i any -f "port 3799" -X
Logging all events
Sniffing on (any)
2024-07-04 22:33:52.795608 (1) Disconnect-Request Id 15 any:172.29.1.69:40468 -> 10.12.45.74:3799 +0.000
        NAS-IP-Address = 10.12.45.74
        Calling-Station-Id = "ec:8e:b5:a7:b5:09"
        Authenticator-Field = 0x98db7a6fce580d8b5ad2ab0531088c59
2024-07-04 22:33:52.921755 (2) Disconnect-NAK Id 15 any:172.29.1.69:40468 <- 10.12.45.74:3799 +0.126 +0.126
        Error-Cause = Missing-Attribute
        Authenticator-Field = 0x345a3183eaa7bb7d1779ac8d000c3403
2024-07-04 22:33:58.121755 (1) Cleaning up request packet ID 15

And on the old version (10.3):

hugo@pf1:~$ sudo radsniff -i any -f "port 3799" -X
Logging all events
Sniffing on (any)
2024-07-04 22:40:58.756540 (1) Disconnect-Request Id 233 any:172.29.1.69:41290 -> 10.12.45.74:3799 +0.000
        NAS-IP-Address = 10.12.45.74
        Calling-Station-Id = "ec:8e:b5:a7:b5:09"
        Acct-Session-Id = "8O2.1x8111001d000207eb"
        Authenticator-Field = 0xc7476a9cd3e3fbc88a703e14b5bd6862
2024-07-04 22:40:58.283715 (2) Disconnect-ACK Id 233 any:172.29.1.69:41290 <- 10.12.45.74:3799 +0.208 +0.208
        Authenticator-Field = 0x2db1f8aa6de8ff311aea332c8e6cd7bf
2024-07-04 22:41:03.483715 (1) Cleaning up request packet ID 233

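So it looks like the Acct-Session-Id attribute is missing from the Disconnect-Request sent by 13.2, and the switch answers with a Disconnect-NAK (Error-Cause = Missing-Attribute). I tried to find out how this attribute could be added, but I'm not even sure whether this can be done through PacketFence's configuration, or whether it has to be added to the RADIUS config, switch templates, etc.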
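To make the comparison of the two captures repeatable, here is a minimal Python sketch that parses radsniff text output and lists which attributes appear in one Disconnect-Request but not in the other. The function names (`parse_packets`, `request_attribute_diff`) and the regular expressions are my own assumptions based on the output format shown above; this is not part of PacketFence or FreeRADIUS.

```python
import re

# Packet header line as printed by radsniff in the captures above, e.g.
# "2024-07-04 22:33:52.795608 (1) Disconnect-Request Id 15 any:... -> ...:3799 +0.000"
PACKET_RE = re.compile(r"^\S+ \S+ \(\d+\) (?P<code>[A-Za-z-]+) Id (?P<id>\d+) ")
# Indented attribute line, e.g. '        Acct-Session-Id = "..."'
ATTR_RE = re.compile(r"^\s+(?P<name>[A-Za-z0-9-]+) = (?P<value>.*)$")

NEW_CAPTURE = """\
2024-07-04 22:33:52.795608 (1) Disconnect-Request Id 15 any:172.29.1.69:40468 -> 10.12.45.74:3799 +0.000
        NAS-IP-Address = 10.12.45.74
        Calling-Station-Id = "ec:8e:b5:a7:b5:09"
        Authenticator-Field = 0x98db7a6fce580d8b5ad2ab0531088c59
2024-07-04 22:33:52.921755 (2) Disconnect-NAK Id 15 any:172.29.1.69:40468 <- 10.12.45.74:3799 +0.126 +0.126
        Error-Cause = Missing-Attribute
"""

OLD_CAPTURE = """\
2024-07-04 22:40:58.756540 (1) Disconnect-Request Id 233 any:172.29.1.69:41290 -> 10.12.45.74:3799 +0.000
        NAS-IP-Address = 10.12.45.74
        Calling-Station-Id = "ec:8e:b5:a7:b5:09"
        Acct-Session-Id = "8O2.1x8111001d000207eb"
        Authenticator-Field = 0xc7476a9cd3e3fbc88a703e14b5bd6862
"""


def parse_packets(text):
    """Return a list of (code, packet_id, {attribute: value}) tuples."""
    packets = []
    for line in text.splitlines():
        header = PACKET_RE.match(line)
        if header:
            packets.append((header["code"], int(header["id"]), {}))
            continue
        attr = ATTR_RE.match(line)
        if attr and packets:
            # Attribute lines belong to the most recent packet header.
            packets[-1][2][attr["name"]] = attr["value"].strip('"')
    return packets


def request_attribute_diff(old_text, new_text, code="Disconnect-Request"):
    """Compare attribute names of the first packet of type `code` in each capture."""

    def first_attrs(text):
        for pkt_code, _pkt_id, attrs in parse_packets(text):
            if pkt_code == code:
                # The authenticator differs for every packet, so ignore it.
                return set(attrs) - {"Authenticator-Field"}
        return set()

    old, new = first_attrs(old_text), first_attrs(new_text)
    return sorted(old - new), sorted(new - old)


missing_in_new, only_in_new = request_attribute_diff(OLD_CAPTURE, NEW_CAPTURE)
print("missing in new request:", missing_in_new)
print("only in new request:", only_in_new)
```

With the two captures above, the only attribute reported as missing from the 13.2 request is Acct-Session-Id.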

Another thing I noticed (which might be related, but I had ignored it until now) is that PacketFence is unable to determine whether a node is online or offline. It just shows a yellow dot in the status column of the node list instead of a green or red one. Could there be a more general accounting issue in my setup that causes both problems? pfacct seems to be running without issues though (I presume this service is responsible for accounting).

Any guidance in solving this will be much appreciated. Thanks in advance.

@hrodenburg
Author

Hello @fdurand,

I'm a bit uncomfortable mentioning you like this, so sorry for that, but I would really appreciate your thoughts on this issue. Is there anything I can do myself to get this going or to troubleshoot further?

In the meantime I was able to test with a Juniper switch which is running Junos version 15 (and this version has a different template within Packetfence), but the exact same issue occurs.

Thanks again,
Hugo

@fdurand
Member

fdurand commented Sep 12, 2024

Hello, sorry for the delay on this issue.
Could you share your switches.conf?
I am not able to reproduce the problem: on my side the Disconnect-Request does contain the Acct-Session-Id.

@fdurand fdurand self-assigned this Sep 12, 2024
@hrodenburg
Author

Hello fdurand,

Thanks for getting back to me on this. It is very strange that it doesn't happen on your side. I remember I tried setting up a clean system to test this with a minimal configuration, but I ran into issues with the initial installation of PacketFence (I did not understand why, but the installer seemed broken at that point). I see that PacketFence 14 has been released in the meantime, so I can give that a try.

My switches.conf:

# Copyright (C) Inverse inc.
#
#
#
# See the enclosed file COPYING for license information (GPL).
# If you did not receive this file, see
# http://www.fsf.org/licensing/licenses/gpl.html
[default]
description=default
registrationVlan=41
isolationVlan=42
inlineVlan=44
#
# SNMP section
#
# PacketFence -> Switch
SNMPVersion=2c
#
# RADIUS NAS Client config
#
# RADIUS shared secret with switch
radiusSecret=<secret>
adminsVlan=45
mdwVlan=47
dmxVlan=48
crewVlan=43
securityVlan=49
guestVlan=46
defaultVlan=41
video3Vlan=52

[10.12.45.60]
group=junex4200switches
description=<removed>

[10.12.45.67]
useCoA=Y
group=junex4200switches
description=<removed>

[10.12.45.71]
useCoA=Y
group=junex2200switches
description=<removed>
uplink=21,22,23,24,25,26

[group junex2200switches]
description=Juniper EX2200 switches
deauthMethod=RADIUS
uplink=13,14
uplink_dynamic=static
type=Juniper::Junos
radiusDeauthUseConnector=N

[group junex4200switches]
description=Juniper EX4200 switches
deauthMethod=RADIUS
uplink_dynamic=static
type=Juniper::Junos_v15_x
useCoA=N
radiusDeauthUseConnector=N

A few notes:

  • I removed some sensitive information like secrets and switch descriptions
  • There are more switches in each group, but they all have the exact same configuration (except for IP and description, of course)
  • There is actually another group present for unifi access points, but I don't see relevance for this issue
  • I noticed that uplinks are defined (e.g. uplink=21,22,23,24,25,26), which does not make sense, because these are switches with only 14 interfaces. This might be inherited from an older config/setup. Not sure if that would cause any harm though.
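To check which settings a given switch actually ends up with, here is a rough Python sketch that merges the sections of a switches.conf excerpt. It assumes the precedence is switch section over `[group ...]` section over `[default]`; that ordering is my assumption about how PacketFence resolves inheritance and should be verified against the documentation. The helper name `effective_settings` and the trimmed `SAMPLE` text are made up for illustration.

```python
import configparser

# Trimmed excerpt of the switches.conf shown above (secrets and descriptions left out).
SAMPLE = """
[default]
description=default
registrationVlan=41

[10.12.45.60]
group=junex4200switches

[10.12.45.71]
useCoA=Y
group=junex2200switches
uplink=21,22,23,24,25,26

[group junex2200switches]
deauthMethod=RADIUS
uplink=13,14
type=Juniper::Junos

[group junex4200switches]
deauthMethod=RADIUS
type=Juniper::Junos_v15_x
useCoA=N
"""


def effective_settings(conf_text, switch):
    """Merge [default], the switch's group and the switch's own section.

    Assumed precedence (not confirmed by the thread): switch > group > default.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep mixed-case keys such as useCoA unchanged
    parser.read_string(conf_text)

    merged = dict(parser["default"]) if parser.has_section("default") else {}
    own = dict(parser[switch])
    group = own.get("group")
    if group and parser.has_section(f"group {group}"):
        merged.update(parser[f"group {group}"])
    merged.update(own)  # the switch's own values win last
    return merged


for ip in ("10.12.45.60", "10.12.45.71"):
    settings = effective_settings(SAMPLE, ip)
    print(ip, settings.get("type"), "uplink =", settings.get("uplink"),
          "useCoA =", settings.get("useCoA"))
```

Under that assumption, 10.12.45.71 resolves to type Juniper::Junos with its own uplink list overriding the group's uplink=13,14.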

Thanks again!
Hugo
