The local n_lcores configuration of gt.lua exceeds 2, resulting in policy matching exception. #651

Closed
ShawnLeung87 opened this issue Jul 25, 2023 · 15 comments

@ShawnLeung87

ShawnLeung87 commented Jul 25, 2023

On the Grantor server, when gt.lua is configured with local n_lcores = 4, reload_policy fails silently: no error message or success message is logged, and policy lookups return nil.
With local n_lcores = 2, reload_policy works: a success message is logged, and the policy matches correctly.

@AltraMayor
Owner

Could you post the log entries that are added when n_lcores = 4 and a policy reload is issued?

@ShawnLeung87
Author

Before 17:28, 4 lcores were configured.
The log contains no "Successfully updated the Lua state" entry, and the policy was not updated.
After 17:28, 2 lcores were configured. The log shows
"Successfully updated the Lua state", and the policy update succeeded.

[2023-07-24 17:13:07] GT/3: Successfully updated the Lua state incrementally
[2023-07-24 17:13:07] GT/6: Successfully updated the Lua state incrementally
[2023-07-24 17:13:07] GT/5: Successfully updated the Lua state incrementally
[2023-07-24 17:13:07] GT/4: Successfully updated the Lua state incrementally
[2023-07-24 17:28:51] GT/4: Successfully updated the Lua state
[2023-07-24 17:29:10] GT/3: Successfully updated the Lua state
[2023-07-24 17:30:49] GT/3: Successfully updated the Lua state incrementally
[2023-07-24 17:30:49] GT/4: Successfully updated the Lua state incrementally
[2023-07-24 17:33:54] GT/3: Successfully updated the Lua state incrementally
[2023-07-24 17:33:54] GT/4: Successfully updated the Lua state incrementally
[2023-07-24 17:35:37] GT/4: Successfully updated the Lua state
[2023-07-24 17:35:56] GT/3: Successfully updated the Lua state

@ShawnLeung87
Author

I captured this log entry:
gt: failed to allocate new lua state to GT block 0 at lcore 4

@ShawnLeung87
Author

lua_state = luaL_newstate();
if (lua_state == NULL) {
	G_LOG(ERR, "Failed to create new Lua state at %s\n",
		__func__);
	goto out;
}

@AltraMayor
Owner

[2023-07-24 17:13:07] GT/3: Successfully updated the Lua state incrementally

What are you updating incrementally?

@AltraMayor
Owner

capture this log gt: failed to allocate new lua state to GT block 0 at lcore 4

Which version of Gatekeeper are you running?

What are the largest data structures in your policy?

@AltraMayor
Owner

lua_state = luaL_newstate();
if (lua_state == NULL) {
	G_LOG(ERR, "Failed to create new Lua state at %s\n",
		__func__);
	goto out;
}

You copied the snippet above from file gt/main.c, correct?

Have you found this log entry in your log file as well? It doesn't match the log entry "gt: failed to allocate new lua state to GT block 0 at lcore 4" that you mentioned before. I don't understand your point here.

@ShawnLeung87
Author

“failed to allocate new lua state to GT block 0 at lcore 4”
This entry is logged when alloc_and_setup_lua_state returns NULL.
I am running version 1.1.0 from the master branch.
Something is odd here. In my test environment, which uses a 10G NIC, this error appeared several times and then stopped appearing. In the production environment, which uses 40G NICs, it appears every time. The code, data volume, and policy are the same in both environments, but the kernels differ: production runs 5.4.0-131-generic, and the test environment runs 5.4.0-125-generic.

@ShawnLeung87
Author

I have created more than 6000 policies in a Lua table. Issue #637 mentions that the policy consumes 1 MB of memory. Is that the consumption of a single policy, or the total consumption of all policies?

@ShawnLeung87
Author

I found the problem. I designed an LPM table that covers both global and regional data: it is loaded with the global IP data, and regions are distinguished by ID. The global IP text file is 31 MB, and LuaJIT can only load it completely with 2 lcores. With 4 lcores, the reload fails because LuaJIT runs out of memory. This leads back to issue #637: upgrading to LuaJIT 2.1 may solve the problem.

@ShawnLeung87
Author

This problem does not occur if the amount of data entered into the LPM table is halved. It does occur with the complete global IP data. The results are consistent across my production and test environments.
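
A possible reading of these observations (2 lcores load the full data, 4 lcores fail, 4 lcores load half the data) is that each GT lcore keeps its own copy of the policy data and that all copies draw on one limited memory pool. The C sketch below models only that arithmetic. The shared-pool model, the function name policy_fits, and any sizes passed to it are assumptions for illustration, not measurements or Gatekeeper code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: every GT lcore holds one full copy of the policy
 * data (per_copy_bytes), and all copies share a single memory pool of
 * pool_bytes. Returns true if n_lcores copies fit in the pool.
 */
static bool policy_fits(uint64_t per_copy_bytes, unsigned int n_lcores,
	uint64_t pool_bytes)
{
	if (n_lcores == 0)
		return true;	/* No copies, nothing to fit. */

	/* Divide instead of multiplying so the check cannot overflow. */
	return per_copy_bytes <= pool_bytes / n_lcores;
}
```

With a hypothetical 1024 MiB pool and a 400 MiB footprint per copy, policy_fits reports that 2 copies fit and 4 do not, while 4 copies of a halved 200 MiB footprint fit again, which mirrors the pattern reported above.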

@ShawnLeung87
Author

This IP file has 1.6 million entries. Does LuaJIT 2.0 not support that much data?

@AltraMayor
Owner

If your Lua policy uses lpmlib.new_lpm() and/or lpmlib.new_lpm6() for the list of network prefixes, Lua holds very little of the memory of these LPM tables; most of the memory of these tables is allocated in hugepages, which does not interfere with LuaJIT 2.0's memory. If you create Lua tables (e.g. t = {}), all this memory is under LuaJIT's management.

Could you run the script lua_memory.lua with the command gkctl to check the memory allocated in your Lua policy? This script requires 'policylib' to be a global variable in your policy.
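
To make the distinction concrete: an LPM table keeps its prefixes in compact native memory, so Lua holds very little of it, whereas entries stored in a Lua table are all allocated by LuaJIT. The C sketch below is a toy longest-prefix match over a packed array, shown only to illustrate what storing prefixes outside the Lua heap can look like. It is not Gatekeeper's lpmlib or DPDK's LPM library; the names toy_prefix, depth_to_mask, and toy_lpm_lookup are hypothetical, and a linear scan stands in for the real lookup structure.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packed entry kept in native memory, not in a Lua table. */
struct toy_prefix {
	uint32_t addr;	/* Network address in host byte order. */
	uint8_t depth;	/* Prefix length, 0 to 32. */
	int32_t label;	/* For example, a region or policy ID. */
};

/* Build the netmask for a prefix length. */
static uint32_t depth_to_mask(uint8_t depth)
{
	/* Shifting a 32-bit value by 32 or more is undefined in C. */
	if (depth == 0)
		return 0;
	if (depth >= 32)
		return UINT32_MAX;
	return (uint32_t)(UINT32_MAX << (32 - depth));
}

/* Return the label of the longest matching prefix, or -1 if none matches. */
static int32_t toy_lpm_lookup(const struct toy_prefix *tbl, size_t n,
	uint32_t ip)
{
	int best_depth = -1;
	int32_t best_label = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		uint32_t mask = depth_to_mask(tbl[i].depth);

		if ((ip & mask) == (tbl[i].addr & mask) &&
				tbl[i].depth > best_depth) {
			best_depth = tbl[i].depth;
			best_label = tbl[i].label;
		}
	}
	return best_label;
}
```

For a table holding 10.0.0.0/8 with label 1 and 10.1.0.0/16 with label 2, looking up 10.1.2.3 returns 2 because the /16 is the longer match, and looking up 192.168.0.1 returns -1.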

@ShawnLeung87
Author

Yes, there is a Lua function that stores this IP data in a Lua table. I will modify the function and test it later.

@ShawnLeung87
Author

Good news: after modifying the Lua function, all 4 lcores load the data normally.
