Search API - Count per object #11280

Open
g-saracca opened this issue Feb 21, 2025 · 5 comments
Labels
GREI Re-arch (Issues related to the GREI Dataverse rearchitecture) · Original size: 10 · Size: 10 (A percentage of a sprint. 7 hours.) · SPA (These changes are required for the Dataverse SPA) · Type: Bug (a defect)

Comments

@g-saracca
Contributor

g-saracca commented Feb 21, 2025

What steps does it take to reproduce the issue?
Navigate to https://beta.dataverse.org/spa/collections and open browser dev tools, select the Network tab and filter requests by the word 'search'.

  • What happens?
    The Search API's count-per-object property returns 0 for any type that is not sent to the endpoint as a type query parameter.
    For example, for this API call:
https://beta.dataverse.org/api/v1/search?q=*&show_facets=true&sort=date&order=desc&show_type_counts=true&subtree=root&per_page=10&type=dataverse&type=dataset

the total_count_per_object_types is returned as:

{
    "Datasets": 60,
    "Dataverses": 40,
    "Files": 0
}

The Files count should not be returned as 0 but as 10,059 (the current number on beta), which is the actual number of existing files, even if the query parameter type=file is not sent. This replicates the JSF counting behaviour on the collection page.

  • To whom does it occur (all users, curators, superusers)?
    All SPA Users and API Users

  • What did you expect to happen?
    The API should return the count for all types, even if the type query parameter for a given type was not sent.
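
As a minimal sketch of the request and the reported symptom, the snippet below rebuilds the query string from the example above and flags any object type whose count is 0 while that type was not requested. The helper names (`build_search_query`, `unrequested_zero_counts`) and the mapping from `type` values to count keys are assumptions made for illustration; they are not part of the Dataverse API.

```python
from urllib.parse import urlencode

# Assumed mapping between `type` query values and the keys of the count
# object shown in the issue; for illustration only.
TYPE_TO_COUNT_KEY = {"dataverse": "Dataverses", "dataset": "Datasets", "file": "Files"}


def build_search_query(types):
    """Return the example query string, with one `type` pair per requested type."""
    params = [
        ("q", "*"),  # urlencode percent-encodes "*" as %2A
        ("show_facets", "true"),
        ("sort", "date"),
        ("order", "desc"),
        ("show_type_counts", "true"),
        ("subtree", "root"),
        ("per_page", "10"),
    ]
    params += [("type", t) for t in types]
    return urlencode(params)


def unrequested_zero_counts(counts, requested_types):
    """List count keys that are 0 and belong to a type that was not requested."""
    requested_keys = {TYPE_TO_COUNT_KEY[t] for t in requested_types}
    return [key for key, value in counts.items() if value == 0 and key not in requested_keys]


query = build_search_query(["dataverse", "dataset"])
# Counts as reported in the issue for the request above.
reported = {"Datasets": 60, "Dataverses": 40, "Files": 0}
suspicious = unrequested_zero_counts(reported, ["dataverse", "dataset"])
```

With the counts reported above, `suspicious` contains only `Files`, the one type whose zero cannot be told apart from "not counted".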

Which version of Dataverse are you using?
unstable

g-saracca added the Type: Bug label on Feb 21, 2025
g-saracca added the Size: 10, SPA, GREI Re-arch, and Original size: 10 labels on Feb 21, 2025
@qqmyers
Member

qqmyers commented Feb 21, 2025

FWIW: The SearchIncludeFragment for the page actually does a second search (always for the first page and one object) to get the file info and (I assume) the file-related facets. Given that this is an added cost, we might want a flag so that uses of search that aren't drawing a search page can skip it? (That gives the workaround of just making the second call from the SPA, but since we already have the code for it in Java, in the SearchIncludeFragment, it probably makes sense to keep it in the back end?)
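
The workaround mentioned here, a second and cheaper search issued by the client for the types left out of the first one, might look like the sketch below. It assumes the count keys `Dataverses`, `Datasets`, and `Files` from the issue description, and that a follow-up request with `per_page=1` and `show_facets=false` is enough to obtain the counts. `followup_params` and `merge_counts` are hypothetical helpers, not SPA or Dataverse code.

```python
ALL_TYPES = ("dataverse", "dataset", "file")
# Assumed mapping from `type` values to count keys; for illustration only.
COUNT_KEY = {"dataverse": "Dataverses", "dataset": "Datasets", "file": "Files"}


def followup_params(base_params, requested_types):
    """Build params for a second search covering only the types missing from the first.

    Returns None when every type was already requested.
    """
    missing = [t for t in ALL_TYPES if t not in requested_types]
    if not missing:
        return None
    # Drop the paging, facet, and type settings of the first request.
    kept = [(k, v) for k, v in base_params if k not in ("type", "per_page", "show_facets")]
    # One result and no facets: only the counts are of interest.
    kept += [("per_page", "1"), ("show_facets", "false")]
    kept += [("type", t) for t in missing]
    return kept


def merge_counts(first_counts, second_counts, requested_types):
    """Take counts for requested types from the first response, the rest from the second."""
    requested_keys = {COUNT_KEY[t] for t in requested_types}
    merged = {}
    for key in COUNT_KEY.values():
        source = first_counts if key in requested_keys else second_counts
        merged[key] = source.get(key, 0)
    return merged


base = [
    ("q", "*"), ("subtree", "root"), ("show_type_counts", "true"),
    ("show_facets", "true"), ("per_page", "10"),
    ("type", "dataverse"), ("type", "dataset"),
]
second = followup_params(base, ["dataverse", "dataset"])
# Illustrative second response; 10059 is the file count quoted in the issue.
merged = merge_counts(
    {"Datasets": 60, "Dataverses": 40, "Files": 0},
    {"Datasets": 0, "Dataverses": 0, "Files": 10059},
    ["dataverse", "dataset"],
)
```

Doing the merge on the client costs an extra round trip per search, which is the trade-off against keeping the second query in the back end.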

@g-saracca
Contributor Author

@qqmyers I understand.
A query param flag sounds good, but we already added one that controls whether the counts are returned: show_type_counts.
So if this could be expensive extra backend work, could you perform it only when show_type_counts is true?
That way we avoid an extra flag and reuse the one we already have.
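
A minimal sketch of that idea, written in Python for brevity rather than in the backend's Java: the extra count query for unrequested types runs only when `show_type_counts` is true. `count_by_type` stands in for whatever the backend uses to count matches and is purely hypothetical, as are the sample numbers returned by the fake below.

```python
ALL_TYPES = ("dataverse", "dataset", "file")


def type_counts(requested_types, show_type_counts, count_by_type):
    """Return counts per type, or None when counts were not asked for.

    count_by_type(types) is a hypothetical callable returning {type: count}
    for the given types; each call represents one search query.
    """
    if not show_type_counts:
        return None  # no counts requested, so no extra query either
    counts = dict(count_by_type(list(requested_types)))
    missing = [t for t in ALL_TYPES if t not in requested_types]
    if missing:
        # The extra query for unrequested types happens only on this path.
        counts.update(count_by_type(missing))
    return counts


calls = []


def fake_count_by_type(types):
    """Record each simulated query and return sample counts."""
    calls.append(tuple(types))
    sample = {"dataverse": 40, "dataset": 60, "file": 10059}
    return {t: sample[t] for t in types}


without_counts = type_counts(["dataverse", "dataset"], False, fake_count_by_type)
calls_without_counts = len(calls)
with_counts = type_counts(["dataverse", "dataset"], True, fake_count_by_type)
```

Callers that leave `show_type_counts` off pay nothing extra, while callers that set it always receive a count for every type.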

@g-saracca
Contributor Author

In case it was not clear: we only need the count of all types for that search, even for types that were not sent as type query parameters.
We do not need file information or file-related facets, nor information or facets for any object type not requested in the search, because that would show the wrong facets. Only the count.

@qqmyers
Member

qqmyers commented Feb 21, 2025

Sorry - I was confused by doing https://demo.dataverse.org/dataverse/demo/?q=test which shows the file types facets, but I see it also flips to searching for all three types. Consistent with what you're saying, the line I highlighted has addFacets false.

w.r.t. the show_type_counts flag - is there ever a time when you want, for example, file counts to be 0 when you're only searching for collections/datasets? If not, it sounds like that flag's effect could be updated.

The query for missing types should certainly be less expensive than the main query, so making it probably isn't a big deal, but I still think it would be useful to be able to turn it off, especially for queries for specific fields, spatial queries, etc., where the answer is always zero.

@g-saracca
Contributor Author

is there ever a time when you want, for example, file counts to be 0 when you're only searching for collections/datasets?

No, we need to replicate JSF, and when you select Dataverses and Datasets you can still see the count of Files in the UI. So no, I don't think so.

As for adding a new flag, I leave it to you to decide what is best.
From the point of view of a frontend client consuming the API, it sounds like the flag would have to be set dynamically based on the fields in the current query, which is not easy to do when all the fields are dynamic.
