
External Search Refactor #11281

Draft. Wants to merge 14 commits into base: develop

qqmyers (Member) commented Feb 21, 2025

What this PR does / why we need it: This is a draft PR demonstrating the ability to configure several search engines. It is primarily a proof of concept of the general design that has been discussed, but with the example ExternalSearchServiceBean and PostExternalSearchServiceBean classes it could already support developing or demonstrating a separate tool. Those beans send all of the search parameters from Dataverse to a third-party web service (via GET or POST, respectively) and expect to get back an ordered JsonArray of matching entityIds. They then send a query to Solr via Dataverse's internal SolrSearchServiceBean, which generates the output needed for display. (Since Solr doesn't preserve the order of the ids in the query, these beans reorder the results to match.)
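The reordering step described above can be sketched as follows. This is a minimal illustration, not the PR's code: `Hit` and `ReorderSketch` are hypothetical names, and the real beans work with Dataverse's own Solr result types.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ReorderSketch {
    // Hypothetical stand-in for a Solr search hit; not a Dataverse class.
    record Hit(long entityId, String title) {}

    /** Returns hits in the order given by orderedIds, skipping ids Solr did not return. */
    static List<Hit> reorder(List<Long> orderedIds, List<Hit> solrHits) {
        // Index the Solr hits by entityId so each lookup is O(1).
        Map<Long, Hit> byId = new HashMap<>();
        for (Hit hit : solrHits) {
            byId.put(hit.entityId(), hit);
        }
        // Walk the ids in the external service's order and collect matching hits.
        List<Hit> ordered = new ArrayList<>();
        for (Long id : orderedIds) {
            Hit hit = byId.get(id);
            if (hit != null) {
                ordered.add(hit);
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<Long> externalOrder = List.of(42L, 7L, 19L);
        List<Hit> fromSolr = List.of(new Hit(7L, "B"), new Hit(19L, "C"), new Hit(42L, "A"));
        System.out.println(reorder(externalOrder, fromSolr));
    }
}
```

Ids that the external service returns but Solr does not (for example, entities the user may not see) are simply dropped in this sketch; how the real beans handle that case is not stated in this description.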

Which engine is used is controlled by a MicroProfile/JVM option: -Ddataverse.search.default-service=<name>

where <name> is 'externalSearch', 'postExternalSearch', etc., as defined in the classes.
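As a rough sketch of how a name-based selection like this could work: the `SearchService` interface below is a hypothetical one-method stand-in, the `solr` fallback name is an assumption, and `System.getProperty` is used as a plain-JDK substitute for the MicroProfile Config lookup the PR relies on.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ServiceSelectionSketch {
    // Hypothetical minimal interface; the PR's real SearchService has a richer contract.
    interface SearchService {
        String getServiceName();
    }

    static final String PROPERTY = "dataverse.search.default-service";
    // Assumed fallback name for the built-in engine; not confirmed by the PR text.
    static final String FALLBACK = "solr";

    // Creates a trivial service that only knows its own name.
    static SearchService named(String name) {
        return () -> name;
    }

    /** Picks the configured service, falling back to the default when the name is unset or unknown. */
    static SearchService select(Map<String, SearchService> registry, String configuredName) {
        SearchService chosen = configuredName == null ? null : registry.get(configuredName);
        return chosen != null ? chosen : registry.get(FALLBACK);
    }

    public static void main(String[] args) {
        Map<String, SearchService> registry = new LinkedHashMap<>();
        for (String name : new String[] {"solr", "externalSearch", "postExternalSearch"}) {
            registry.put(name, named(name));
        }
        // Reads the value passed on the command line as -Ddataverse.search.default-service=...
        String configured = System.getProperty(PROPERTY);
        System.out.println(select(registry, configured).getServiceName());
    }
}
```

Running this with `java -Ddataverse.search.default-service=externalSearch ServiceSelectionSketch.java` selects the external service; omitting the option selects the assumed fallback.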

Both external classes also rely on a new, temporary :ExternalSearchUrl parameter, which is easier to change than a JvmSetting.

The ExternalSearchServiceBean has been 'tested' with :ExternalSearchUrl pointed at a static page returning a list of valid entityIds. The logs show that all the search params are sent as query params, and the final search results contain only those entities, in the same order as the list that was sent. The POST version has not been tested, since there is no service set up to receive a POST.

There are other example beans: goldenOldies, which only returns entries with entityIds < 1000, and oddlyEnough, which only returns entries whose entityIds are odd. Neither of these is practical, but they allow some testing without any external service and, for oddlyEnough, show how one can partially deal with paging, changing the total number of results, etc. in what comes back from Solr.
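The two filters are easy to state as predicates. The sketch below uses hypothetical names (`FilterSketch`, `filterPage`) and plain lists of ids rather than the PR's result types; it only shows why filtering after Solr has paged the results changes the page size and the totals.

```java
import java.util.List;
import java.util.function.LongPredicate;
import java.util.stream.Collectors;

class FilterSketch {
    // goldenOldies: keep only entityIds below 1000.
    static final LongPredicate GOLDEN_OLDIES = id -> id < 1000;
    // oddlyEnough: keep only odd entityIds (!= 0 so negative odd values also pass).
    static final LongPredicate ODDLY_ENOUGH = id -> id % 2 != 0;

    /** Applies a filter to one page of entityIds as returned by the underlying engine. */
    static List<Long> filterPage(List<Long> page, LongPredicate keep) {
        return page.stream()
                .filter(id -> keep.test(id))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // A page of six ids as the underlying engine might return it.
        List<Long> page = List.of(1L, 2L, 3L, 998L, 1000L, 1001L);
        List<Long> odd = filterPage(page, ODDLY_ENOUGH);
        // The filtered page is shorter than the requested page size, so any
        // total-result count reported by the engine no longer matches.
        System.out.println(odd.size() + " of " + page.size() + " kept: " + odd);
    }
}
```

Because the filter runs after paging, a service built this way either has to over-fetch from Solr or report adjusted counts, which is the paging problem the oddlyEnough example touches on.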

Which issue(s) this PR closes:

  • Closes #

Special notes for your reviewer:

The ability to find/enable SearchService implementations packaged in separate jars is coded but has not been tested.
If only the existing Solr engine needs to be a bean, this code could be much more like the code for PidProviders and exporters (which are known to work).
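For reference, one standard way to discover implementations packaged in separate jars is `java.util.ServiceLoader` over a `URLClassLoader`. Whether this PR uses exactly that mechanism is not stated here, so the following is a sketch under that assumption, with a hypothetical one-method `SearchService` interface.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.HashMap;
import java.util.Map;
import java.util.ServiceLoader;

class DiscoverySketch {
    // Hypothetical minimal interface; the PR's real SearchService has a richer contract.
    interface SearchService {
        String getServiceName();
    }

    /**
     * Loads every SearchService declared in META-INF/services within the given jars
     * and indexes the instances by their service name.
     */
    static Map<String, SearchService> discover(URL[] jarUrls) {
        ClassLoader parent = DiscoverySketch.class.getClassLoader();
        // The loader is left open on purpose: classes loaded from it stay in use.
        URLClassLoader loader = new URLClassLoader(jarUrls, parent);
        Map<String, SearchService> found = new HashMap<>();
        for (SearchService service : ServiceLoader.load(SearchService.class, loader)) {
            found.put(service.getServiceName(), service);
        }
        return found;
    }

    public static void main(String[] args) {
        // With no jars on the path, nothing is discovered.
        System.out.println(discover(new URL[0]).keySet());
    }
}
```

Instances created this way are plain objects rather than container-managed beans, which is the trade-off the note above alludes to.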

The SolrSearchServiceBean is currently a @Singleton. I changed it while testing other things and am not sure whether that is still needed or whether it can be @Stateless again.

Suggestions on how to test this:

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

Is there a release notes update needed for this change?:

Additional documentation:

coveralls commented Feb 21, 2025

Coverage Status

coverage: 22.699% (-0.04%) from 22.736%
when pulling 036be4e on GlobalDataverseCommunityConsortium:Search_Refactor
into 2210d16 on IQSS:develop.
