-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CRABServer REST should not talk to CRIC #6917
Comments
we just had a small storm of CRABServer failures over 1h, all due to errors in talking to CRIC. |
first thing should be to make this capable to deal with wildcards in the site list CRABServer/src/python/TaskWorker/Actions/DagmanCreator.py Lines 777 to 781 in 2ed6f59
then will worry about modifying REST to pass the list with the |
current code in REST relies on
CRABServer/src/python/CRABInterface/Utilities.py Lines 172 to 176 in 2ed6f59
(By the way that's hard to understand since WMCore CRIC class already has a 1h default cache inside... oh well...) TaskWorker actions are done in independente processes, so it woule make sense to reuse
current caching reduces number of calls to external service, but makes things fail miserably if server is down when cache expires. Should definitely combine with the call to CRIC in DataDiscovery and have a single cache file, ref. #6946 Or at least a common access method with the refresh+use-stale policy. |
Let's move info from #6946 inside here, to simplify tracking
|
maybe a good topic for @mapellidario next month ? I am not comfortable with named ntuple and decorators (see
|
On Hold waiting for decision on who will work on it |
I will need some guidance (as always), but I will happily take this! |
thanks Dario. Certainly ! Let's assume that we can get to this sometimes in March |
we forgot to remove onhold label, doing it now |
Regarding the #8937 changes. Summary
(See also test results [1][2][3]) Test Results[1] When user submit invalid site (e.g. T2_TH_Bangkok) or glob pattern in either whitelist or blacklist to expand (e.g. T2_TH_*). Instead of response right away to user with CherryPy 400 Bad Request from CRABServer as following.
Now, It reject LATER by TaskWorker with SUBMITFAILED task status as following.
[2] As well as, when user submit banned Storage Site regarding
[3] Test output shows that CRIC requests have not only been cached.
But The new caching also mitigate flooding of CRIC debugging log from WMCore that we have to temporary turn-on-and-off using context manager: |
NOTES: regarding #8937 (comment) extensive changes than we expected. Problem Definition:Throughout, TaskWorker code base, every time we wish to fetch CRIC-related sites info in any TaskAction. We've to instantiate CRIC(logger=self.logger, configDict=configDict) within either envForCMSWEB or tempSetLogLevel context manager, which leave us with kind of boilerplate code in many places as we can see in [1][2][3][4]. [1] CRABServer/src/python/TaskWorker/Actions/DagmanCreator.py Lines 911 to 915 in 2283a7b
[2] CRABServer/src/python/TaskWorker/Actions/DataDiscovery.py Lines 37 to 41 in 2283a7b
[3] CRABServer/src/python/TaskWorker/Actions/UserDataDiscovery.py Lines 35 to 38 in 2283a7b
[4] CRABServer/src/python/TaskWorker/Actions/MakeFakeFileSet.py Lines 19 to 31 in 2283a7b
Solution:As you might know, what i'm trying to do is Dependency Injection, weak form of Inversion of Control (IoC), like we did with config [WMCore.Configuration] on higher context manager (e.g. Actions/Handler.py). But not overdo it until becoming web frameworks like Angular, Laravel and Nestjs.
Clean Code Principles that we embrace/emphasize here:
🥹 I'm by no means an expert at this. used to have discipline once in a while but wish to be disciplined Boy Scout someday! Of course! not overdo it and break things are number one priority! |
In response to Stefano's review, I've split this PR #8937 into 2 stages as following.
|
better to avoid talking to any external service from the REST
supporting proper authentication and debugging problems is too much of a pain
IIRC this access is only used to fill the site whitelist for MC with "all sites".
Which in principle can be done in TW.
I looked at this some time ago and change seemed too big to be worth.
But now I think that it is worth the effort.
@mapellidario FYI
The text was updated successfully, but these errors were encountered: